PREFACE

The  only  variation  from  current  practice  in  the  organization 
of  the  subject  matter  of  this  study  is  the  changing  of  the  position 
of  the  conclusions.  The  writer  has  stated  his  conclusions  at  the 
very  beginning  of  the  study.  This  was  done  to  enable  the  busy 
reader  to  see  at  a  glance  the  general  trend  of  the  study. 

As  the  practical  school  administrator  knows,  the  gathering  of 
the  kind  of  data  upon  which  this  study  is  based  is  a  matter  of 
some  difficulty.  It  is  the  kind  of  data,  however,  which  we  must 
have  if  science  is  to  aid  in  the  selection  of  teachers. 

Some  psychologists  may  contend  that  an  analysis  of  teaching 
rather  than  the  correlation  of  observable  facts  with  varying 
amounts  of  success  of  actual  teachers  is  the  only  correct  method 
of  determining  what  tests  will  distinguish  good  from  poor 
teachers.  No  one  would  deny  the  value  of  sagacious  insight  into 
any  problem  of  human  engineering.  So  far  neither  the  analysis 
method  nor  the  correlation  method  has  done  very  well  in  practice 
on  the  practical  job.  This  study  is  based  on  the  correlation 
method.  Its  shortcomings  should  not  be  confused  either  with 
the  logical  soundness  or  with  the  practical  superiority  of  test 
construction  on  a  basis  of  correlation  between  test  scores  and 
performance. 

Frederic  B.  Knight. 
CONCLUSIONS BASED ON THIS STUDY

This  thesis  deals  with  the  problems  of  isolating  the  significant 
and  measurable  qualities  of  effective  teaching  and  the  methods  of 
measuring  these  qualities.  It  is  a  continuation  of  similar  studies 
of  which  the  work  of  Meriam  was  the  first.  A  rating  of  153 
high-school  and  elementary-school  teachers  was  obtained  by  hav- 
ing the  teachers  rate  each  other  for  the  quality  of  general  teaching 
ability  and  other  traits.  While  it  may  be  said  that  the  teachers 
knew  each  other  only  in  a  social  way  and,  therefore,  could  not 
rate  each  other  for  general  teaching  ability,  the  data  show  that 
an  adequate  rating  can  be  procured  by  this  method. 

The  statistical  treatment  of  the  data  shows  that : 

1. Chance halves of the mutual ratings of the teachers correlate
with each other +.899, ±.01.

2. The mutual ratings of the teachers correlate with the ratings
of the supervisors +.962, ±.001.

3. The mutual ratings of the teachers correlate with pupils'
estimates +.681, ±.05.

4. There is also substantial agreement among the mutual ratings
of teachers, when they rate for specific traits, such as intellectual
strength and skill in discipline. The average correlations
between chance halves of the ratings are respectively +.879,
±.016, and +.838, ±.023. These correlations are evidence
of the ability of teachers to rate each other.
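
The ± figures attached to these coefficients read most naturally as probable errors, the usual convention of the period. The text does not state the formula, so the sketch below is an assumption: it uses the classical probable error of a correlation coefficient, 0.6745(1 - r^2)/sqrt(n), and the function name and the choice of n = 153 are illustrative only.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Classical probable error of a Pearson correlation r based on n cases.

    Assumed formula (not stated in the text): 0.6745 * (1 - r^2) / sqrt(n).
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Illustration only: the first coefficient above, with n = 153 assumed
# from the number of teachers mentioned in the text.
pe = probable_error(0.899, 153)
print(round(pe, 3))
```

Under these assumptions the result is close to the ±.01 reported for the first coefficient; the other reported figures need not follow from the same n.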

The  ratings  for  general  teaching  ability,  secured  in  this  way, 
were  used  as  measures  of  teaching  merit,  against  which  objective 
facts  were  correlated.  The  correlations  between  general  teaching 
ability  and  age,  amount  of  experience,  quality  of  handwriting, 
intelligence as measured by test, major academic interests, normal-school
scholarship, amount of professional study during active
service,  and  ability  to  pass  a  professional  test  have  been  secured. 
The  correlations  are  too  low  to  warrant  one  in  using  these  factors 
for  prognostic  purposes,  except  ability  to  pass  a  professional  test 
(+.541),  normal-school  scholarship  (+.153)  and  intelligence 
(+.108). By using the coefficient of partial correlation we find, in
the case of elementary school teachers, that, the factors of intelligence
and normal-school scholarship being constant, there is a
mutual relationship of +.57 between ability to teach and ability
to  pass  a  professional  test.  Professional  tests  may  be  used  to 
estimate  probable  success  in  teaching.  The  amount  of  profes- 
sional study  accomplished  during  active  service  is  also  indicative 
of  success  in  teaching.  The  number  of  teachers  who  had  accom- 
plished professional  study  of  this  sort  was  too  small  in  the  groups 
which  were  studied  to  allow  an  accurate  determination  of  the 
degree  of  significance  that  professional  study  has. 
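
The partial correlation mentioned above, with two factors held constant, can be sketched as follows. This is a minimal illustration of the standard textbook formulas, not a reconstruction of the author's computation: the function names are hypothetical, and the input values in the example are invented, since the intercorrelations among the predictors are not given in this section.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r2(r12, r13, r14, r23, r24, r34):
    """Second-order partial correlation r12.34.

    Variable 3 is removed first, then variable 4 is removed from the
    resulting first-order partials.
    """
    r12_3 = partial_r(r12, r13, r23)
    r14_3 = partial_r(r14, r13, r34)
    r24_3 = partial_r(r24, r23, r34)
    return partial_r(r12_3, r14_3, r24_3)


# Invented values: 1 = teaching ability, 2 = professional test,
# 3 = intelligence, 4 = normal-school scholarship.
example = partial_r2(r12=0.54, r13=0.11, r14=0.15, r23=0.30, r24=0.25, r34=0.40)
print(round(example, 3))
```

When the controlled factors correlate only weakly with teaching ability, as the coefficients above suggest, holding them constant changes the coefficient for the professional test very little.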

In  the  case  of  high-school  teachers,  intellectual  differences,  as 
determined  by  mental  tests,  appear  to  be  significant.  For  the 
selection  of  high-school  teachers  the  use  of  mental  tests  would  be 
of  value. 

These  data,  as  a  whole,  may  be  interpreted  to  mean  that  the 
general  factor  of  interest  in  one's  work  becomes  the  dominant 
factor  in  determining  one's  success  in  teaching.  The  reasoning 
which  leads  to  this  conclusion  is  not  straightaway,  for  we  have 
not  as  yet  objective  tests  of  interest.  We  do  know,  however,  that 
other  measurable  traits,  either  alone  or  in  combinations,  are  not 
adequate  explanations  of  teaching  success.  With  our  present 
knowledge  it  is  reasonable  to  suppose  that  genuine  interest  in 
one's  work  accounts  for  a  large  part  of  teaching  success. 

In  the  second  part  of  the  study  data  are  presented  which  show 
the  spread  of  general  estimate  to  particular  traits,  when  judgments 
or  ratings  are  made.  For  example,  when  a  judge  attempts  to  rate 
a  teacher  in  some  particular  trait,  his  rating  is  a  defense  of  his 
general  estimate  of  that  teacher,  as  well  as  a  rating  of  the  trait 
under  consideration. 

The  mutual  judgments  of  teachers  for  the  trait,  intellectual 
ability,  correlate  with  their  judgments  of  general  teaching  ability 
+.935, ±.014.

The mutual judgments of teachers for the trait, skill in discipline,
correlate with their judgments for general teaching ability
+.789, ±.001.

The mutual judgments of teachers for the trait, skill in discipline,
correlate with their judgments for intellectual ability +.863,
±.080.

It  would  be  difficult  to  hold  that  these  correlations  represent  the 
true  relationships  which  exist  between  these  pairs  of  traits.     The 
presence of a large factor of spread of general estimate accounts
best  for  the  size  of  these  correlations. 

A  study  of  the  correlations  between  the  ratings  of  126  teachers 
in  a  New  York  school  system  for  15  traits  showed  that  105  of  the 
120  correlations  studied  could  be  accounted  for  by  chance  varia- 
tion from  an  average  correlation,  even  if  a  perfect,  or  a  100  per 
cent,  spread  of  general  estimate  was  present. 

A  study  of  the  correlation  between  qualities  of  teaching  as 
presented by Boyce in his work, published in the Fourteenth Yearbook
of the National Society for the Study of Education, shows that
85  per  cent  of  the  correlations  come  within  a  range  of  ±.150. 
These  facts  can  be  satisfactorily  explained  only  when  a  factor  of 
spread  of  general  estimate  is  allowed. 

It  seems  fair  to  conclude,  therefore,  that  in  judging  particular 
traits  general  estimate  influences  the  particular  estimate  to  such  a 
degree  that  judgments  of  particular  traits  are  in  themselves  of 
little  practical  use. 


CHAPTER  I 
INTRODUCTION 

This thesis¹ lies in the field of research which is concerned
with  methods  of  rating  teaching,  of  determining  the  significant 
factors  in  teaching  ability,  and  of  measuring  objectively  such 
factors. 

This  field  of  research  is  by  no  means  a  virgin  one  nor  is  it  one  of 
academic  interest  only.  Practical  school  administrators  have  no 
more  important  and,  at  present,  no  more  troublesome  problems 
than  those  which  are  grouped  around  the  technique  of  selecting 
and  rating  teachers. 

Actual  isolation  of  such  factors  as  intellect  and  temperament, 
which  are  indispensable  to  successful  teaching,  and  the  discovery 
of a method of determining whether a prospective teacher possesses
the indispensable qualities of a good teacher would be a boon to
school  administrators. 

During  the  past  fifteen  years  educators  and  psychologists  have 
given  their  earnest  attention  to  this  series  of  problems,  on  which  a 
great  deal  has  been  written  and  on  which  much  research  work  has 
been  done.  Three  studies  have  been  selected  to  show  the  general 
development  of  attempts  which  have  been  made  to  find  solutions 
to different phases of the personnel-management problems of our
public  schools. 

MERIAM'S STUDY

Dr. L. L. Meriam, in a research study, Normal School Education
and Efficiency in Teaching, published in 1906 (Teachers College
Contributions to Education, No. 1, Chapter IV), presented data
which were used to discover the correlation between teaching
efficiency and scholarship in the normal school.

"This is the problem," said Dr. Meriam. "Is the efficient
teacher the proficient scholar? To what extent is he so in each
of the subjects of the normal-school course? In other words,
does the one who stands high among fellow-teachers stand relatively
high among fellow-students in the work preparatory to his
teaching? Such a study of mental relationships is in itself a study
of causes. If it be found a rule that efficiency in teaching follows
proficiency in scholarship, then, other things being equal, the
latter may be considered a vital contribution to the former. And
this is our present purpose: to discover, so far as possible, what
elements enter into the making of a capable teacher. Corollary
questions are: To what extent does proficiency in scholarship mean
efficiency in teaching? . . ."

¹ All data used in this study are on file at Teachers College, Columbia
University, New York City.

In  Dr.  Meriam's  research  study  an  admirable  attempt  was  made 
to  find  out  the  relative  teaching  ability  of  a  large  number  (1,185) 
of  normal-school  graduates.  Equally  careful  work  was  done  to 
determine  the  relative  normal-school  success  of  these  graduates. 
Meriam  had  no  accurate  measure  of  teaching  efficiency  and  no 
reliable  measure  to  equate  the  amount  of  success  in  one  school 
system  with  the  amount  of  success  in  another.  He  encountered 
the  same  difficulty  in  interpreting  normal-school  marks  as  meas- 
ures of  scholastic  accomplishment.  Great  statistical  ingenuity 
was  shown  by  Dr.  Meriam  and  his  results,  by  all  odds,  were 
the  most  dependable  at  the  time  of  the  publication  of  his 
thesis. 

The  correlation  between  normal-school  standing  and  ability  or 
success  in  the  field  was  found  to  be  so  surprisingly  low  that  dif- 
ferences in  scholarship  among  students  in  the  normal  schools 
seemed  to  bear  a  negligible  relation  to  future  differences  in  teach- 
ing ability.  Meriam  found  that  practice  teaching  during  normal- 
school  training  was  slightly  prophetic  of  the  quality  of  teaching 
which  should  be  expected  after  graduation.  Examinations  con- 
taining professional  subject-matter  did  not  appear  to  furnish  a 
significant  index  of  an  individual's  ability  to  teach. 

The  statistical  difficulties  of  Meriam's  work  should  not  blind 
one  to  its  value.  It  clearly  stated  the  problem  of  correlating 
teaching  ability  with  factors  which  are  more  or  less  objective  and 
measurable.  It  developed  a  technique  of  research  that  was  sound 
in  theory.  It  exercised  much  influence  in  taking  the  problem  of 
teaching  efficiency  from  the  field  of  opinion  and  discussion  and  in 
placing it where it properly belongs, namely, in the field of research
and objective measurement.

Meriam's  more  important  findings  are  expressed  as  coefficients 
of  correlation  between  teaching  efficiency  and  scholarship  in 
normal-school  studies.     These  he  reports  as  follows: 
Correlation between Teaching Ability and Practice Teaching ................. +.39
Correlation between Teaching Ability and Psychology ........................ +.37
Correlation between Teaching Ability and History and Principles of Education +.28
Correlation between Teaching Ability and Method Courses .................... +.29
Correlation between Teaching Ability and Academic Courses .................. +.22

Meriam's  data  also  support  his  conclusion  that,  after  the  first 
year  of  teaching,  experience,  as  such,  has  little  if  any  influence 
on  the  improvement  of  teaching  efficiency. 

ELLIOTT'S STUDY

In  1910  another  treatment  of  this  general  subject  was  published 
which  deserves  notice.  Dr.  Edward  C.  Elliott  presented  to  the 
second  annual  convention  of  city  superintendents  in  Wisconsin 
"A  Tentative  Scheme  for  the  Measurement  of  Teaching  Effi- 
ciency." This  score  card  has  been  revised  in  detail,  but  the  first 
scheme  included  all  the  essential  factors.  Elliott  stated  these 
three  propositions  which  were  of  more  than  temporary  importance: 

"Is  it  possible  to  devise  and  to  apply  to  the  teaching  process, 
impersonal,  quantitative  standards,  whereby  the  relative  worth 
and  efficiency  of  teachers  may  be  determined  more  justly  and  with 
greater precision than under the ordinary practices of the day?

"Does not the effective organization, administration, and supervision
of public schools require that the conditions and results of
the  teachers'  work  be  subjected  to  measurements  of  a  quantitative 
rather  than  a  qualitative  nature? 

"Is  it  possible  for  the  present  generation  to  make  any  reliable 
and  satisfactory  conclusions  concerning  the  direction  and  rate  of 
educational  progress  without  standards  of  value  resting  upon  a 
quantitative  basis?" 

The  scheme  divides  teaching  efficiency  into  seven  sections  and 
to  each  section  assigns  a  weight  or  value.  The  scheme,  in  sum- 
mary, is  here  reproduced: 

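  I. Physical Efficiency ..............  12 points
 II. Moral-nature Efficiency ..........  14 points
III. Administrative Efficiency ........  10 points
 IV. Dynamic Efficiency ...............  24 points
  V. Projected Efficiency .............   6 points
 VI. Achieved Efficiency ..............  24 points
VII. Social Efficiency ................  10 points

     Total ............................ 100 points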
The value of this scheme is that attention is directed to particular
traits and that diagnosis of teaching merit is stimulated.
The  suggested  values  are,  of  course,  matters  of  opinion. 
The  assumption  that  analysis  of  the  teacher  and  of  the  judgment 
of  particular  qualities,  studied  in  isolation,  can  be  made  is  highly 
questionable. 

BOYCE'S STUDY

In addition to Meriam's study the only other research of extensive
nature is that made by A. C. Boyce¹ and published under
the title "Methods of Measuring Teachers' Efficiency," Part II
of the Fourteenth Yearbook of the National Society for the Study of
Education. Boyce obtained the rating of a great many teachers
for general merit and for specific qualities. Then, by a method of
correlation, he worked out the relative significance of the qualities.
There are many technical improvements in this study over
Meriam's, but the general procedure is the same.

For  fifteen  years  the  teaching  profession  has  been  sensitive  to 
problems  of  recruiting  new  members.  As  yet,  however,  no  one 
knows  the  exact  formula  for  success  in  teaching.  The  complexity 
of  personality  and  character  and  the  many-sidedness  of  teaching 
have  continually  baffled  useful  analysis.  We  know  that  several 
measurable  traits  are  not  essential  to  successful  teaching,  but  we 
do  not  know  what  traits  must  be  present  in  superior  instructors. 
The  inspiring  advance  in  the  application  of  psychological  methods 
to  the  selection  of  clerks,  stenographers,  machine  operators,  and 
fliers  in  industry  together  with  similar  success  in  vocational 
guidance  in  professional  education,  such  as  engineering  and 
dentistry,  increases  our  confidence  in  the  hope  that  before  long 
psychology  will  enable  school  administrators  to  select  teachers 
with  frequency  and  size  of  error  far  smaller  than  prevails  at 
present. 

1  For  a  discussion  of  this  study,  see  the  last  part  of  Chapter  V. 


CHAPTER  II 
METHOD  AND  DATA  INVOLVED  IN  THIS  STUDY 

An  accurate  rating  of  a  sufficiently  large  number  of  teachers  for 
general  teaching  ability  must  be  obtained  before  any  analysis  of 
the  significant  qualities  of  teaching  is  possible.  We  must  know 
who  the  good  teachers  are,  who  are  the  poor  teachers  and  who  are 
the  fair  teachers,  before  it  is  worth  while  to  attempt  to  find  out 
what  facts  are  pertinent  in  judging  their  teaching  skill.  After  we 
get  a  group  of  teachers  who  we  know  differ  among  themselves 
in  general  teaching  ability,  by  certain  amounts  or  units,  then  we 
may  proceed,  by  a  method  of  correlation,  to  find  out  what  facts 
about  them  are  of  prognostic  or  diagnostic  value. 

Such  a  rating  of  general  teaching  ability  for  156  grade  and 
high-school  teachers  who  were  at  work  in  the  public  schools  of 
Towns  A,  B,  and  C,  in  Massachusetts,  during  the  school  year 
1918-1919,  has  been  obtained.  There  were  six  groups  of  teachers. 
Three  of  these  groups  were  the  grade  teachers  in  Towns  A,  B,  and 
C,  and  three  were  the  high-school  faculties  in  these  towns.  The 
number  of  teachers  in  each  group  follows: 

Town A   Grade teachers ......... 53
         High-school teachers ... 15
Town B   Grade teachers ......... 35
         High-school teachers ... 13
Town C   Grade teachers ......... 30
         High-school teachers ... 10


Three  separate  ratings  for  general  ability  in  teaching  were  ob- 
tained for  each  group.  One  rating  was  secured  from  the  supervi- 
sors in  each  system  for  their  respective  teachers.  Another  was 
secured  by  the  mutual  judgments  of  the  teachers  themselves  of  each 
group.     Another  was  secured  by  a  consensus  of  pupils'  opinions. 

In  general  the  method  which  was  used  in  deriving  the  ratings 
was to have the several judges rate each teacher in the group relative
to the other members of the group for the broad quality,
general ability as a teacher. The theory which underlies this
method  is  this:  Where  direct  measurement  in  terms  of  amount  is 
impossible,  measurements  by  relative  position  in  a  series  may  be 
so controlled that possibly as exact and as true ratings may be obtained
as if units of amount had been used. It is assumed that
the amount of difference between two teachers who have been thus
judged will depend on the ease with which the differences are observed
by competent judges.

In  using  this  general  method  of  rating  teachers  for  teaching 
ability,  we  have  taken  it  for  granted  that  the  good  teacher  is  the 
one  whom  competent  judges  rate  as  good.  We  hold  throughout 
this  study  that  the  poor  teacher  is  the  one  whom  the  judges  have 
rated  as  poor.  These  hypotheses  will  presumably  be  acceptable 
to  those  who  are  familiar  with  the  theory  and  practice  of  social 
measurements.  It  may  be  admitted  that  one  could  question  the 
final  truth  that  the  opinions  of  any  number  of  judges,  however 
competent  and  harmonious  they  might  be,  necessarily  establish 
the  facts  of  teaching  merit.  Thus  the  really  good  teacher,  it  might 
be  held,  is  the  one  who  gets  her  pupils  on  fastest. 

To  determine  how  much  the  progress  of  pupils  is  due  to  any  one 
teacher  is  not  possible  by  any  method  or  information  that  is  as  yet 
available.  Even  to  measure  a  pupil's  total  progress,  much  less 
the  total  progress  of  a  class,  is  as  yet  a  little  venturesome. 

It  might  also  be  held  that  the  amount  of  development  of  char- 
acter and  morality  in  the  pupils  is  the  only  test  of  good  teaching 
and  that  what  others  think  about  the  teacher  is  really  irrelevant. 

To  hold,  on  the  other  hand,  that  competent  judgments  of  teach- 
ers, when  properly  combined,  will  give  a  very  useful  and  approxi- 
mately true  rating,  as  well  as  probably  the  best  rating  method 
that  is  now  available,  is  only  common  sense.  This  rating  method 
is  entirely  defensible. 

The good lawyer, after all, is the one who is considered a good
lawyer  by  fellow-members  of  the  bar.  The  poor  dentist  is  the  one 
to  whom  no  other  dentist  would  go  or  recommend  anybody  else. 
The  great  preacher  is  the  one  who  attracts  visitors.  The  good 
teacher  is  the  teacher  who  is  thought  to  be  good. 

Where differences in skill among employed people must be determined,
judgments in terms of better than the average, poorer than
one's associates, and similar expressions, are useful measures of
ability. Of course, the final validity of the judgments may be lessened
by the presence of constant error in the opinion that is offered,
or by the incompetence or paucity of the opinions expressed,
or by the failure properly to combine the judgments after they
are obtained.
Of the three ratings (by the teachers themselves, which is
labeled "A"; by the supervisors, "B"; by the pupils, "C") we shall
take up first the ratings of the teachers which are indicated by the
judgments of their fellow-teachers.

PROCESS OF RATING TEACHERS, BASED UPON THE TEACHERS' ESTIMATES

Step 1. Teachers' meetings for each group were called and the
teachers  were  asked  to  rate  each  other  for  general  teaching  ability, 
using  the  relative-position  method.  The  ratings  were  not  in 
terms  of  good,  fair,  poor,  because  what  one  teacher  might  consider 
good,  another  teacher  who  was  more  critical  might  consider  only 
fair.  This  type  of  difference  might  run  through  the  series  of 
judgments. 

The ratings were not secured in terms of how much below the best
teacher you have ever known, or the equivalent expressions, for errors
of an obvious nature are bound to creep into any such rating
system. The ratings were all given in terms of relative position
within the group itself. Thus, when the grade teachers of Town A
rated each other, every teacher placed in order of merit all the
teachers in the Town A group. The amount of difference between
the teachers in the final rating was determined by combining all
the judgments of the teachers. Each member of the six groups of
teachers, while in a teachers' meeting, rated those in the group to
which she belonged in a similar fashion and under similar conditions
with the same instructions. The instructions which were
given to the teachers follow:

INSTRUCTIONS  TO  TEACHERS 

On  this  sheet  you  are  requested  to  give  certain  ratings  of  each  teacher  in  the 
list,  including  yourself.  Please  rate  every  teacher  and  please  be  absolutely 
frank  in  your  ratings.  You  need  not  sign  your  name.  Nobody  will  ever  know 
how  you  or  anybody  else  rated  him.  No  personal  use  will  ever  be  made  of  any 
of  these  ratings.  They  will  be  used  in  a  purely  scientific  study  to  determine  the 
significance  of  age,  education,  early  interests,  etc.,  etc.,  for  success  as  a  teacher. 
The  names  will  all  be  cut  off  and  destroyed  as  soon  as  the  different  items  in  the 
inquiry  have  been  numbered  to  fit  the  ones  to  whom  they  refer.  Also,  do  not 
feel  disturbed  because  in  each  respect  somebody  has  to  be  rated  lowest.  These 
ratings  are  all  relative,  and  the  lowest  teacher  in  the  group  may  well  be  of  very 
great ability. Please be sure to record ratings, even if they seem to you to be
little better than mere guesses. The opinions of twenty men give a useful rating,
even if any one of the twenty taken alone is almost worthless.
On the sheet is a list of the teachers. Choose the teacher of greatest teaching
ability and write 1 after his or her name in Column 1. Choose the teacher next
below in teaching ability and write 2 after his or her name in Column 1. Write
3 after the name of the one next below in teaching ability, and do so for 4, 5, 6,
etc. If two or more seem absolutely equal in teaching ability give them the
same rating.*

After  the  teachers  had  read  the  instructions  carefully  a  few 
minutes  were  allowed  them  for  asking  any  questions  that  might 
occur.  When  it  was  clear  that  the  teachers  understood  what  was 
wanted  of  them,  they  proceeded  with  the  rating.  No  names 
were  signed  to  the  rating  sheets.  It  was  evident  that  honest  and 
sincere  opinions  were  expressed.  The  resultant  ratings  of  each  of 
the  six  groups  of  teachers  were  then  examined.  Those  sheets 
which  were  incomplete  or  did  not  sufficiently  distribute  the  ratings 
were  discarded.  This  lack  of  usable  material  was  not  at  all 
great. 

The  teachers  found  that  rating  each  other  was  a  method  of 
polite  gossip  and  was  evidently  more  or  less  enjoyable.  For  each 
set  of  teachers  sufficient  material  was  obtained.  The  spread  or 
range  between  the  poorest  and  the  best  teacher  was  large.  In 
many  cases  it  was  as  great  as  the  number  of  teachers  involved. 
The useful ratings, 97 in number, were distributed as follows:

Town A   Grade teachers ......... 30 ratings
         High-school teachers ... 14
Town B   Grade teachers ......... 16
         High-school teachers ... 10
Town C   Grade teachers ......... 18
         High-school teachers ...  9

         Total .................. 97


Step 2. Each set of ratings was then divided into two halves by
chance drawings. Each half has been treated separately throughout
this study. These halves will be referred to as Group A and
Group B. The carrying of two groups makes possible corrections for
attenuation in the correlations and shows also the reliability of
the judgments.
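
Step 2 can be sketched in code. The sketch below is a minimal illustration and not the author's procedure: the rating sheets are made-up data, averaging positions within each half is a simplifying assumption (the study combines judgments by its own method, taken up in Step 3), and all function names are hypothetical. The last function is Spearman's correction for attenuation in its usual form.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def chance_halves(sheets, seed=0):
    """Shuffle the rating sheets and split them into two halves by chance."""
    shuffled = list(sheets)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def mean_positions(sheets):
    """Average position of each teacher over a set of sheets.

    A value of None means the judge gave that teacher no rating.
    Assumes every teacher is rated by at least one judge in the set.
    """
    means = []
    for t in range(len(sheets[0])):
        given = [s[t] for s in sheets if s[t] is not None]
        means.append(sum(given) / len(given))
    return means


def disattenuate(r_xy, r_xx, r_yy):
    """Spearman's correction: r_xy divided by sqrt of the two reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Made-up data: four judges each place five teachers in order of merit.
sheets = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
    [1, 2, 4, 3, 5],
]
half_a, half_b = chance_halves(sheets)
reliability = pearson(mean_positions(half_a), mean_positions(half_b))
print(round(reliability, 2))
```

The correlation between the two halves plays the part of the "chance halves" coefficients reported in the conclusions; a coefficient between the ratings and some outside fact could then be passed through `disattenuate` together with the reliabilities of the two measures.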

A transcript (see Table I) of fifteen of the ratings for general
teaching ability of Town A grade teachers has been made. These
ratings compose one group (Group B) of the mutual judgments
which is treated later to get a single rating of teachers. The
columns include the complete ratings of fifteen different judges.
Thus, by looking at the first column one will see that one judge
rated Ah— as the 21st best teacher of the group, Dr— as the 6th
best, Sm— as the best, Hi— as the 3rd, El— as the 21st, and so on.

* The complete instructions are given on pp. 46-48.

The  numbers  opposite  each  teacher's  name  show  the  ratings 
received.  Thus,  Ah —  was  rated  by  one  teacher  as  21st,  by  an- 
other as  2nd,  by  another  as  1st,  by  another  as  11th,  by  another  as 
1st,  and  so  on.  In  this  way,  we  have  approximately  750  ratings 
of  teachers  in  a  group  in  terms  of  relative  worth.  The  ratings  by 
fifteen  other  teachers  of  the  teachers  of  this  group  were  similarly 
obtained  and  transcribed. 

We  have  the  judgments  of  every  teacher  on  every  teacher. 
Occasionally  a  judge  failed  to  rate  some  teacher.  This  was  due 
to the lack of acquaintance with that teacher. This kind of omission
is an index of thoughtful estimate, because it indicates the
fact that a judge who had no opinion gave no rating rather than
record a mere guess. The trustworthiness of these judgments will
be discussed later.*

Step  3.  We  now  have  the  relative  ranking  of  each  teacher  in 
the  opinion  of  fifteen  judges.  This  is  a  chance  half  of  all  of  the 
ratings;  namely,  Group  B.     Group  A  was  similarly  obtained. 

The  next  step  is  to  combine  these  two  groups  of  ratings  into  a 
single  rating.  The  theory  which  underlies  the  procedure  requires 
some  explanation.  We  know  several  facts  about  the  resultant 
and  combined  rating,  even  before  we  make  it. 

First,  the  final  relative  arrangement  will  not  be  the  result  of  any 
one  individual  judgment,  since  it  will  be  the  product  of  the  ratings 
of  fifteen  judges.  The  bias  of  any  single  person  who  has  served  as 
a  judge  will  not  operate  unduly  to  influence  the  final  result.  The 
fifteen  sets  of  ratings  were  chosen  by  chance  and  chance  errors  of 
overestimation  or  underestimation  of  any  teacher,  because  of  the 
particular  friendship  of  or  dislike  for  any  teacher,  by  a  particular 
judge,  will  be  offset  by  opposite  chances. 

Second,  except  for  a  negligible  chance  in  the  drawing  of  the 
fifteen  ratings  for  Group  A  or  Group  B,  there  will  be  no  constant 
error,  due  to  the  fact  that  the  judges  may  know  some  of  the  teach- 
ers very  well  and  others  only  slightly.  No  one  judge  will  know  all 
the  teachers  equally  well;  but  the  teachers  who  are  well  known 

* See page 17.
TABLE I

A Transcript of Fifteen Ratings for General Teaching Ability¹
Group B, Town A, Elementary Grade Teachers

No.  Teacher  Ratings by Judges (1 to 15)
 1   Ah—      21  2  1  11  1  21  1  1  1  1  2  11  2  7
 2   Dr—      6  1  3  1  10  2  2  2  1  3  52  2
 3   Sm—      1  3  1  1  2  18  2  2  1  5  46  1
 4   Hi—      3  15  16  3  4  2  1  7  19  3
 5   El—      21  1  3  8  25  2  26  1  2  2  4  2  2  15  22  8
 6   Co—      23  5  5  1  8  1  6  2  3  22  1  34  32  20
 7   Sy—      15  2  13  1  17  2  1  2  4  24  2  21  16  ..
 8   Pa—      11  14  6  17  1  13  1  4  2  3  12  5  1  11  6
 9   Sp—      10  14  10  14  1  18  11  8  2  2  2  3  6  15  2
10   Le—      28  1  2  1  11  7  2  4  2  1  17  17  15

¹ It was proper to give two teachers the same rating if their ability seemed
equal.


Better-Worse Judgments Based on Table I

Read: 1 is judged worse than 2, 7 times with 0 equal or tied votes; 1 is
judged worse than 3, 6 times with 1 equal or tied vote, and so on.

a    b    c    d         e    f    g    h

1    2    0    7         5    6    2    6
     3    1    6              7    3    6
     4    1    4

     7    2    4              8    2    7
     8    4    6              9    2    5
     9    2    6              10   3    6

4    5    1    2         8    9    3    6
     6    1    2              10   3    4
     7    3    2              12   1    5

Columns a, b, e, f refer to teachers by number. Columns c, g record tied
votes. Columns d, h record worse votes.
to some of the judges will be those who are not so well known to
others. Then, too, those teachers who are little known to some
judges will be well known to others. Thus, intimate knowledge
and lack of knowledge on the part of those who are judging the
teachers will be somewhat evenly spread over the whole list.

Third,  a  fairly  minute  scale  will  be  possible,  because  the  ratings 
are  spread  about  as  widely  as  the  number  of  teachers  who  have 
been  judged.  Many  of  the  ratings  which  were  presented  by  the 
teachers  had  a  spread  of  over  40,  and  the  number  of  teachers  who 
had  been  judged  was  52. 

In  any  scale  which  is  based  on  relative  position,  it  is  well  known 
that  absence  of  personal  bias,  absence  of  constant  error,  and  a 
large  number  of  judges  are  the  chief  desiderata.  In  these  ratings 
such  desiderata  are  present. 

THE    LOGIC    OF    THE    METHOD 

We  must  have  at  our  command  a  technique,  not  only  of  chang- 
ing measures  of  relative  position  into  measures  of  units  of  amount, 
but  also  of  combining  incomplete  judgments  of  relative  positions 
into  units  of  amount.  Our  problem,  in  simplified  form,  is  to 
change  differences  which  are  noticeable  to  competent  judges  into 
differences  of  units  of  amount.     A  concrete  application  follows: 

If, in the opinion of judges, person A is better than person B, we
must find out how much better. Suppose that, in judging A and B,
we have ten opinions. If five of the judges think that A is better
than B and five think that B is better than A, then we are justified
in calling the matter a draw and rule that A and B are equal.

Suppose,  however,  that  six  think  A  better  than  B,  and  only  four 
think  B  better  than  A,  then  we  are  justified  in  holding  that  A  is 
better  than  B.  The  question  now  is:  By  how  much  is  A  better 
than  B?  The  percentage  of  judges  who  notice  a  difference  be- 
comes the  basis  of  our  procedure. 

It  is  reasonable  to  suppose  that  differences  which  are  noticed 
equally  often  are  equal  in  amount.  We  assume  that  all  judgments 
are  of  equal  value.  Further,  we  arbitrarily  define  as  one  unit  of 
difference  that  difference  which  75  per  cent  of  the  judges  notice. 

Thus,  if  100  judgments  are  made  comparing  A  and  B  and  75 
vote  A  to  be  better  than  B,  then  A  is  better  than  B  by  one  unit. 
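
The conversion from a percentage of judgments to units of amount can be sketched in a few lines. This is only an illustration, and it rests on an assumption not stated in the passage above: that the unit values follow the normal probability curve, with the 75 per cent point taken as one unit. The function name `units_of_difference` is hypothetical. Under that assumption the sketch agrees closely with the tabulated values quoted later in the text (for example .376 for 60 per cent and 3.45 for 99 per cent).

```python
from statistics import NormalDist

# Assumption: unit values follow the normal curve. The deviate that cuts off
# 75 per cent of the curve (about 0.6745) is taken as one unit of difference.
UNIT = NormalDist().inv_cdf(0.75)


def units_of_difference(fraction_better: float) -> float:
    """Convert the fraction of judges who rate A above B into scale units."""
    if not 0.0 < fraction_better < 1.0:
        # 0 and 100 per cent have no finite value on the normal curve.
        raise ValueError("fraction must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(fraction_better) / UNIT


for percent in (50, 60, 75, 90):
    print(percent, round(units_of_difference(percent / 100), 2))
```

A vote of exactly 100 per cent is rejected here because the curve gives it no finite value; the text deals with that case separately.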

When the data are incomplete, the only thing to do is to compare
those judgments which are complete and disregard the judgments
which are not paralleled with similar judgments of the person
compared.

As an illustration, let us turn back to the judgments of the
Town A grade teachers (Table I). In comparing Ah— with Dr—,
we shall neglect the fifth judgment of Ah— because it is not
paralleled with one of Dr—, but we shall not neglect it in comparing
Ah— with El—, for both have ratings given by the same judge.
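
The rule of using only paralleled judgments may be sketched as follows. The function name `tally` and the two short lists of ratings are hypothetical and are not taken from Table I; a missing rating is represented by `None`, which is an assumption of this sketch.

```python
from typing import Dict, Optional, Sequence

Rating = Optional[int]  # None marks a judge who gave this teacher no rating


def tally(a: Sequence[Rating], b: Sequence[Rating]) -> Dict[str, int]:
    """Count paired judgments of teacher a against teacher b.

    A numerically higher rating means a worse standing, as in Table I.
    """
    counts = {"worse": 0, "better": 0, "tied": 0}
    for rating_a, rating_b in zip(a, b):
        if rating_a is None or rating_b is None:
            continue  # judgment not paralleled for both teachers: disregard
        if rating_a > rating_b:
            counts["worse"] += 1
        elif rating_a < rating_b:
            counts["better"] += 1
        else:
            counts["tied"] += 1
    counts["usable"] = counts["worse"] + counts["better"] + counts["tied"]
    return counts


# Hypothetical ratings from five judges (not taken from Table I).
teacher_x = [3, 1, None, 4, 2]
teacher_y = [2, 1, 5, None, 6]
print(tally(teacher_x, teacher_y))
```

Only three of the five hypothetical judges rated both teachers, so only three judgments enter the comparison.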

We  also  give  to  the  teacher  who  has  been  rated  the  lowest  an 
arbitrary  value  of  one  unit.  From  this  as  a  base  we  build  up  the 
values  of  the  series.  It  is  obvious  that  the  more  judgments  there 
are,  the  better  will  be  the  final  rating.  The  more  competent  the 
judges  are,  the  more  dependable  the  final  rating  will  be. 

These  marks  will  be  used  as  measures  only  in  respect  to  their 
differences.  Thus  there  is  a  difference  of  10  units  between  a 
teacher  rated  15  and  one  rated  25.  But  it  would  not  be  correct 
to  think  of  a  teacher  rated  20  as  twice  as  efficient  as  one  rated  10  or 
half  as  good  as  one  rated  40.  These  quantitative  measures  of  ef- 
ficiency cannot  be  compared  on  a  basis  of  multiplication  or  divi- 
sion. We  are  interested  in  the  differences  and  in  no  other  mathe- 
matical relations.  Thus,  if  teachers  A,  B,  C,  are  rated  10,  20,  30, 
we know that they vary in ability by equal amounts. For our
purposes, it would make no difference if the measures were 110,
120, 130, or 610, 620, 630, or 750, 760, 770, as long as the
quantitative differences are preserved.

If  we  know  where  the  true  zero  of  teaching  efficiency  is,  in  rela- 
tion to  the  values  10,  20,  and  30,  then,  of  course,  other  mathemat- 
ical relations  can  be  immediately  worked  out;  but  we  do  not  know 
where  the  true  zero  point  lies. 
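
A short sketch may make the point about differences concrete. It uses the marks 10, 20, 30 from the text; the helper names `gaps` and `shift` are hypothetical. Adding a constant leaves every difference unchanged, while the ratios between marks change, which is why ratios cannot be interpreted without a known zero point.

```python
def gaps(marks):
    """Differences between successive marks."""
    return [later - earlier for earlier, later in zip(marks, marks[1:])]


def shift(marks, constant):
    """Add the same constant to every mark."""
    return [mark + constant for mark in marks]


original = [10, 20, 30]        # teachers A, B, C, as in the text
moved = shift(original, 100)   # 110, 120, 130

print(gaps(original), gaps(moved))                      # differences agree
print(original[1] / original[0], moved[1] / moved[0])   # ratios do not
```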

APPLICATION    OF    THE    METHOD 

In computing the final scale, we begin by making a rough
approximation of the probable order by inspection or by computing
the median rank of each teacher for her fifteen ratings. From an
inspection of the Town A grade ratings, it is easily seen that Ah—
will be better than Co—, for example, since Ah—'s median rating is
6, while Co—'s is 11+. The rough approximation is then refined.
This refinement is done by comparing each teacher with the
one next above and next below and by continuing the process for
several places each way. Usually three places will be sufficient.
The refinement is continued by finding the percentages of the
judgments which are in favor of each teacher and the percentages
of the judgments which rate each teacher numerically lower
in comparison with those near her. In every case only those
judgments are used in which both of the teachers who are compared
are rated. The approximate arrangement, in some cases, will
be wrong. Then the teachers must be shifted in order. By using
the unit values, taken from the table, that correspond to the
percentage differences, a scale of amount of difference between
the teachers can be built up.
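
The first rough ordering by median rank could be computed along these lines. This is a sketch only: the function name `rough_order` and the sample ratings are hypothetical, and missing ratings are assumed to be stored as `None` and simply left out of the median.

```python
from statistics import median
from typing import Dict, List, Optional, Tuple


def rough_order(ratings: Dict[str, List[Optional[int]]]) -> List[Tuple[str, float]]:
    """Order teachers from best (lowest median rank) to worst."""
    medians = []
    for teacher, marks in ratings.items():
        given = [mark for mark in marks if mark is not None]
        if not given:
            continue  # a teacher nobody rated cannot be placed
        medians.append((teacher, median(given)))
    return sorted(medians, key=lambda pair: pair[1])


# Hypothetical ratings by five judges; None marks an omitted rating.
sample = {
    "P": [4, 6, None, 5, 7],
    "Q": [1, 2, 3, None, 2],
    "R": [9, 8, 10, 11, None],
}
print(rough_order(sample))
```

The order obtained this way is only the starting point; the pairwise percentages described above decide whether neighbouring teachers must be shifted.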

In  actual  procedure  it  is  better  to  begin  with  the  worst  teacher 
and  work  up.  A  procedure  of  trial  and  success,  with  frequent 
shiftings  back  and  forth,  will  be  found  more  economical  of  time 
and  patience  than  a  more  complicated  method. 

Referring back to Table I, we find that in comparing Ah— with
Dr— the fifth rating must be disregarded, as it is incomplete.
In seven cases Ah— is higher numerically or worse actually than
Dr—. There are twelve usable judgments in all. This gives 7
out of 12 votes, as it were, against Ah—. The percentage is then
found (here about 58 per cent) and its corresponding unit value
(.30 by Table III). It is well to determine
the percentage differences, not only between a teacher and the
next one to her, but also for three teachers away. In this way any
individual, if rightly placed, will be above not only the one next
lower, but also above the three next in order. Where mistakes in
order occur, then there must be some shifting.

Two exceptions should be noted. Tie ratings are split. And
when there is a 100 per cent agreement that one teacher is better
than the next, then there is theoretically an infinite, or at least an
unknown, amount of difference between them. In this case
common sense is as good a guide as any. Either the statistician
can make two or three indirect comparisons through other
teachers' ranks in comparison with the two in question, or he can
assign a value to 100 per cent a little larger than that assigned to 99
per cent. The latter procedure was followed here. Since the
amount of difference for a 99 per cent to 1 per cent vote is
3.45 units, we arbitrarily assign 4.00 units to a 100 per cent to
0 per cent comparison.
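
The two exceptions can be sketched together. Two points here are assumptions rather than statements of the text: that "split" means each tied vote counts as half a vote for each teacher, and that the unit values follow the normal curve with 75 per cent equal to one unit. The name `signed_units` is hypothetical; the value 4.00 for a unanimous vote is the arbitrary one given above.

```python
from statistics import NormalDist

UNIT = NormalDist().inv_cdf(0.75)  # assumed: 75 per cent equals one unit
CAP = 4.00                         # arbitrary value for a 100-to-0 vote


def signed_units(better: int, worse: int, tied: int) -> float:
    """Units by which a teacher stands above (+) or below (-) another."""
    usable = better + worse + tied
    if usable == 0:
        raise ValueError("no paired judgments to compare")
    # Assumption: each tied vote is split, half to each teacher.
    share = (better + tied / 2) / usable
    if share == 1.0:
        return CAP       # unanimous in favor: arbitrary finite value
    if share == 0.0:
        return -CAP      # unanimous against
    return NormalDist().inv_cdf(share) / UNIT


print(round(signed_units(better=6, worse=5, tied=1), 3))
print(signed_units(better=12, worse=0, tied=0))
```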

While no doubt the explanation seems complex, the procedure,
if followed as described above, will readily yield a well-made
scale. Turning back to the transcript of ratings (see Table I),
which were made by the Town A teachers, we find that the
"Better-Worse" columns give in detail the comparisons in judgments.
The table reads: 1 is worse than 2 (with no equal or tied votes) 7 times;
1 is worse than 3 (with one equal or tied vote) 6 times. Of course,
when the worse votes are less than the better, the table is used in
the same way.

Table II shows, in part, the amount of difference in percentages
from the worst to the best, with comparisons two or three places
removed for teachers of Group A, grade teachers, Town A. The
numbers down the table and across are key numbers to the
teachers' names, as has been noted in Table I. The table reads:
Teacher 2 is better than teacher 1 by 58 per cent; teacher 2 is
better than teacher 3 by 61 per cent; teacher 1 is better than
teacher 3 by 54 per cent; teacher 8 is better than teacher 7 by 57
per cent; and so on.

TABLE  II 

Amount  of  Difference  in  Percentages 
Group  A,  Town  A  Grade  Teachers 


1 

3 

6 

4 

8 

7 

5 

10 

12> 

62 

58 

12' 

50 

57 

12' 

54 
64 

19 

73 
64 
67 
58 
55 
44 
54 

12^ 

57 
50 
53 

162 

45 

16' 

54 
57 

2 

2   

58 

61 
54 

73 

84 
65 
58 
68 

81 

75 
50 
50 
57 

77 
75 
73 
77 
54 
54 

54 
65 

1 

3 

6 

4 

8 

7 

5 

10 

121 

122 

12' 

19 

Step  4.  The  final  step  in  getting  a  quantitative  ranking  is  to 
change  the  percentage  differences  into  amounts  of  difference.  Here 
we  use  the  table  of  percentage  differences  (see  Table  III).  The 
table  shows  the  unit  values  that  correspond  to  percentage  dif- 
ferences and  is  reproduced  from  Thorndike's  Mental  and  Social 
Measurements,  page  123. 
TABLE III

The Amounts of Difference (x − y) Corresponding to Given Percentages
of Judgments that x > y.

% r = the Percentage of Judgments that x > y. Δ/P.E. = x − y, in Multiples
of the Difference such that % r is 75.

% r   Δ/P.E.    % r   Δ/P.E.    % r   Δ/P.E.    % r   Δ/P.E.    % r   Δ/P.E.

50     .00      60     .38      70     .78      80    1.25      90    1.90
51     .04      61     .41      71     .82      81    1.30      91    1.99
52     .07      62     .45      72     .86      82    1.36      92    2.08
53     .11      63     .49      73     .91      83    1.41      93    2.19
54     .15      64     .53      74     .95      84    1.47      94    2.31
55     .19      65     .57      75    1.00      85    1.54      95    2.44
56     .22      66     .61      76    1.05      86    1.60      96    2.60
57     .26      67     .65      77    1.10      87    1.67      97    2.79
58     .30      68     .69      78    1.14      88    1.74      98    3.05
59     .34      69     .74      79    1.20      89    1.82      99    3.45
                                                                100 ¹  4.00

¹ Arbitrarily taken.



TABLE IV

Differences in Amount of Difference for the Lowest Thirteen
Teachers in Town A, Grades, as Judged by Group A

No.    Name    Difference    Per Cent    Amount of Per Cent
               of Amount                 by Table III

49     Cd—     1.00           0
50     Rd—     1.37          60          .376
47     Dn—     1.37          50          .000
47     Rt—     1.86          63          .492
43     De—     2.32          62          .453
43     Cl—     2.69          60          .376
43     Pm—     2.88          55          .186
43     Fm—     3.03          54          .149
41     Jr—     3.21          55          .186
37     By—     3.21          50          .000
40     Me—     3.71          63          .492
37     My—     4.12          61          .414
31     Ws—     4.34          56          .224

Table  IV  shows  the  differences  in  amount  of  difference  for  the
lowest  thirteen  teachers  in  the  grades  of  Town  A  as  judged  by
Group  A.
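
The building up of the scale in Table IV can be retraced in a short sketch: the lowest teacher is given 1.00, and each successive teacher is placed above the one before by the unit value of the percentage between them. The percentages below are those of the "Per Cent" column of Table IV. The normal-curve conversion is an assumption standing in for the lookup in Table III, and the names `step` and `scale` are hypothetical.

```python
from itertools import accumulate
from statistics import NormalDist

UNIT = NormalDist().inv_cdf(0.75)  # assumed: 75 per cent equals one unit


def step(percent: float) -> float:
    """Unit value of a percentage difference (normal-curve assumption)."""
    return NormalDist().inv_cdf(percent / 100) / UNIT


# "Per Cent" column of Table IV, from the second teacher upward.
percents = [60, 50, 63, 62, 60, 55, 54, 55, 50, 63, 61, 56]

# The lowest teacher receives the arbitrary value 1.00; the rest accumulate.
scale = list(accumulate((step(p) for p in percents), initial=1.00))

for value in scale:
    print(f"{value:.2f}")
```

The running totals come out within about a hundredth of the "Difference of Amount" column; the printed table appears to drop the third decimal place rather than round it.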

In  this  way  we  build  up  a  scale  of  teaching  ability  in  terms  of 
amount,  based  upon  fifteen  sets  of  judgments.  In  actual  practice 
one  need  not  carry  the  differences  in  terms  of  amount  to  more  than 
the  first  decimal  place.  This  scale  may  now  be  used  as  a  rating 
scale  of  the  teachers  with  which  to  correlate  any  other  significant 
scale  of  those  teachers.  Thus  we  may  take  their  ages  and,  by 
correlation,  find  out  whatever  influence  age  may  appear  to  have, 
or  we  may  take  professional  training,  salary,  etc. 

POSSIBLE  ERRORS 

In  constructing  a  scale  of  this  kind  there  is  the  possibility  that 
two  types  of  errors  may  be  met;  namely,  constant  errors  and 
variable  errors. 

For  constant  errors  we  cannot  compensate.  A  constant  error 
would  be  a  universal  tendency  on  the  part  of  the  judges,  for  ex- 
ample, to  rate  high  those  teachers  who  were  graduated  from 
Harvard  and  to  rate  low  those  teachers  who  were  graduated  from 
Yale,  because  it  might  popularly  be  thought  that  graduation 
from  Harvard  signified  something  that  graduation  from  Yale 
did  not,  when  in  point  of  fact  it  makes  no  difference  which  college 
the  teachers  had  attended.  Another  more  naive  type  of  constant 
error  would  be  to  rate  high  certain  teachers  because  they  are 
blondes  and  to  rate  low  other  teachers  because  they  are  brunettes, 
when  complexion  is  not  a  determining  factor.  If  all  judges  should 
consistently  err  in  some  such  ways  as  these,  then  we  would  have  a 
constant  error. 

Variable errors are entirely taken care of statistically. Errors
of this type operate when the judges do not know all teachers
equally well. They also occur when clerical mistakes are made
in reporting. These variable errors balance each other and by
proper treatment are either eradicated or at least exposed.

The  most  frequent  suspicion  of  the  validity  of  this  method  of 
rating  is  due  to  the  feeling  that  teachers  do  not  know  each  other 
and,  therefore,  cannot  judge  each  other.  Teachers,  however, 
receive  a  fairly  good  idea  of  each  other  through  conversations,  the 
remarks  of  pupils,  general  reputation,  appearance  in  the  halls, 
teachers'  meetings,  and  other  sources. 
While  it  is  quite  probable  that  teachers  do  not  know,  in  minute
particulars,  whether  other  teachers  are  good  or  poor,  it  would  be 
unwise  to  claim  that  teachers  do  not  know  pretty  well  whether 
their  associates  are  successful  or  not. 

We  have  seen  how  a  quantitative  ranking  of  the  grade  teachers 
of  Town  A  by  a  chance  half  of  the  judgments  was  obtained.  The 
quantitative  ranking  of  the  same  teachers  by  another  chance  half 
of  the  judgments  was  similarly  made.  For  the  high-school  teach- 
ers of  Town  A,  for  the  grade  and  high-school  teachers  of  Town  B 
and  Town  C,  respectively,  exactly  the  same  computations  were 
made.  We  have,  then,  two  sets  of  rankings  for  the  teachers  of 
these  six  groups. 

THE    DEPENDABILITY    OF    THE    DATA 

If  there  is  high  agreement  between  two  chance  halves  of  the 
judges,  such  an  agreement  is  evidence  of  the  reliability  of  the 
data.  The  following  paragraphs  are  concerned  with  establishing 
the  reliability  of  the  judgments  upon  which  the  ratings  are 
based. 

AGREEMENT    BETWEEN    TWO    GROUPS    OF    TEACHERS    WHO    JUDGE 
THEMSELVES   FOR   GENERAL   TEACHING   ABILITY 

The  agreement  between  Groups  A  and  B  for  general  teaching 
ability  is  shown  by  the  following  correlations: 

                                              Correlations

Town A   Grade teachers (53) ................ +.941, ±.016
         High-school teachers (15) .......... +.894, ±.05
Town B   Grade teachers (35) ................ +.906, ±.03
         High-school teachers (13) .......... +.894, ±.05
Town C   Grade teachers (30) ................ +.813, ±.06
         High-school teachers (10) .......... +.664, ±.17

These are raw correlations. The highest, +.941, was from the
largest group, the next highest from the next largest group, and
the lowest correlation from the smallest group. The average
correlation, with weighting for size of group, is +.882 ±.01.
While it is a fact that the larger group and the higher correlation
go together, this fact should not be taken to mean more than it
actually does. We know in general that the smaller the group the
more effective becomes the influence of error and that findings
for large groups are usually more dependable than those for small.

The  average  correlation  of  +.882  ±.01  shows  that  there  is  a
high  degree  of  resemblance  between  the  findings  of  one  group  and 
those  of  the  other.  In  other  words,  if  the  correlation  had  been 
zero,  then  there  would  have  been  nothing  but  a  chance  agreement 
as  to  who  was  the  good  teacher  and  who  was  the  poor  teacher. 
If  there  had  been  a  correlation  of  —1.00,  it  would  have  implied 
that  a  teacher  who  was  highly  esteemed  by  one  group  would  have 
been  thought  meanly  of  by  the  other  group.  If  there  had  been  a 
correlation  of  +1.0,  it  would  have  implied  that  there  was  per- 
fect agreement  between  the  two  groups  in  their  estimates  of 
teachers. 
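
The weighted average can be checked with a few lines. The sketch assumes that "weighting for size of group" means a plain arithmetic mean of the six coefficients, each weighted by the number of teachers in its group; the function name `weighted_mean_r` is hypothetical. With the six coefficients listed above, this reading gives about +.882.

```python
# (number of teachers, correlation) for the six groups listed above
groups = [
    (53, 0.941),  # Town A, grade teachers
    (15, 0.894),  # Town A, high-school teachers
    (35, 0.906),  # Town B, grade teachers
    (13, 0.894),  # Town B, high-school teachers
    (30, 0.813),  # Town C, grade teachers
    (10, 0.664),  # Town C, high-school teachers
]


def weighted_mean_r(pairs):
    """Mean correlation, each group weighted by its number of teachers."""
    total = sum(n for n, _ in pairs)
    return sum(n * r for n, r in pairs) / total


print(round(weighted_mean_r(groups), 3))
```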

In a range of −1.0 to +1.0, an average correlation of +.882 ±.01
is seen to mean an amount of agreement that is, indeed, very
significant. This inner consistency may be taken to connote
that teachers' estimates of each other are by no means a
hit-and-miss affair, but that there is a practical unanimity of opinion
concerning teaching ability. Errors in judgment tend to lower a
correlation and there could not be very much of chance guessing
in a set of judgments which correlate with a similar set as highly
as +.882, ±.01.

AGREEMENT  BETWEEN  TWO  GROUPS  OF  TEACHERS  AND  THE  SUPER- 
VISORS' JUDGMENTS  WHO  JUDGE  THE  SAME  TEACHERS  FOR 
GENERAL    TEACHING    ABILITY 

In  the  Town  A  judgments,  the  supervisory  force  consisted 
of  the  superintendent,  the  principals  who  spent  full  time  in 
supervision,  the  supervisors  of  music,  drawing,  and  physical 
education,  health  officers,  and  two  school-board  members  who 
were  especially  well  informed  concerning  the  teachers.  In  Towns 
B  and  C  the  judgment  of  the  superintendent  alone  was  available. 
There  is  every  reason,  however,  to  assume  that  these  judgments 
were  of  a  high  order. 

The  second  method  of  making  an  estimate  of  teaching  ability 
was  to  have  the  supervisors  rate  each  teacher.  This  was  done  by 
the  relative-position  method.  The  statistical  procedure  was 
similar to that which was used in deriving the ranking of teachers
from the summation of the ratings of the teachers.¹


¹ These ratings are reported in full in the data sheets which are filed at
Teachers College, Columbia University.
By correlation we find the following agreement between the
ratings of teachers of each other and corresponding ratings by
the supervisors:

                                 r between Supervisors   r between Supervisors
                                 and Group A Teachers    and Group B Teachers
                                 for General Teaching    for General Teaching
                                 Ability                 Ability

Town A   Grade teachers          +.934, ±.01             +.999, ±.00
         High-school teachers    +.930, ±.03             +.606, ±.16
Town B   Grade teachers          +.976, ±.00             +.976, ±.00
         High-school teachers    +.972, ±.01             +.833, ±.08
Town C   Grade teachers          +.959, ±.01             +.913, ±.03
         High-school teachers    +.912, ±.05             +.761, ±.13

Average                          +.974, ±.00             +.951, ±.00

Averaging the correlations +.974 and +.951, for the correlation
between the judgments of supervisors and teachers in their
rating for general teaching ability, we get the coefficient of
correlation +.962. These figures may mean that the teachers judge
as  they  do,  because  they  know  in  a  general  way  what  the  super- 
visors think  and  therefore  make  their  ratings  agree  as  far  as  they 
can  with  those  which  they  think  the  supervisors  will  give.  Or 
they  may  mean  just  the  opposite.  More  reasonable  is  the 
opinion  that  teachers  and  supervisors  alike  have  access  to  the 
same  information  and  therefore  form  similar  judgments  from 
a  consideration  of  similar  data. 

Whatever may be the explanation of the high correlation between
the judgments of the teachers themselves and their supervisory
officers, the important thing is that the correlation is high.
If supervisors can form a fair ranking of teachers, then teachers
can rank themselves, as is shown by the high correlation (+.962)
between the Group A plus Group B teachers and the supervisors
who judged for general teaching ability.


AGREEMENT  BETWEEN  PUPILS'  JUDGMENTS  OF  TEACHERS  AND
OF  TEACHERS  AND  SUPERVISORS  WHO  JUDGE  THE  SAME
TEACHERS

There  was  one  group  of  pupils  (nearly  200),  in  the  grades  of 
Town  A,  who  were  receiving  instruction  from  eleven  different 
teachers.  These  pupils  rated  their  teachers  under  dignified  and 
respectful  circumstances.     The  exact  method  will  be  explained 
in a later connection. The correlation between the scale values
which these eleven teachers received in the rating by mutual
judgments and the pupils' ratings was +.681. The high-school
faculties of Towns A and C were similarly rated, with the resulting
coefficients of correlation between mutual judgment rating
and pupils' ratings of +.807 for Town A high school, and +.684
for Town C high school.

Dividing the pupils' ratings into two chance groups (A and B)
and correlating with the supervisors' estimates, we get the
following results:

                                 r Group A Pupils'       r Group B Pupils'
                                 Estimates and           Estimates and
                                 Supervisors'            Supervisors'
                                 Estimates               Estimates

Town A   Grade teachers          +.875                   +.656
         High-school teachers    +.682                   +.730
Town C   High-school teachers    +.631                   +.738

Note. — The  pupils'  ranking  of  teachers  in  the  grades  of  Town  C  could  not 
be  obtained.  The  grades  were  not  departmentalized  and  hence  pupils  were 
acquainted  with  too  few  teachers. 

From  three  separate  sources,  scales,  in  units  of  amount,  for  the 
general  teaching  ability  of  the  teachers,  have  been  obtained. 
The  correlations  are  of  such  a  nature  that  one  is  warranted  in 
assuming  that  the  ratings  which  have  been  given  by  either  group 
of  the  teachers  themselves  are  dependable  ratings  of  general 
teaching  ability. 


CHAPTER  III 

MEASURABLE  FACTS  RELATED  TO  GENERAL 
TEACHING  ABILITY 

We have now a rating for general teaching ability for 156 teachers.
The data which show the correlation between success in
teaching  and  certain  measurable  facts  concerning  teachers  will 
now  be  presented.  As  a  measure  of  teaching  ability  both  the 
teachers'  mutual  ratings  of  Group  A  and  the  supervisors'  ratings 
will  be  used.  In  all  instances  the  correlations  are  computed  by 
the  Pearson  formula. 
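
For reference, the Pearson product-moment coefficient may be sketched as below. The function name `pearson_r` and the sample figures are hypothetical and are not drawn from the study's data sheets.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        # A series in which every value is the same has no spread,
        # and the coefficient cannot be computed.
        raise ValueError("a series with no spread cannot be correlated")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scale values and ages for five teachers.
scale_values = [1.0, 1.4, 1.9, 2.3, 2.7]
ages = [41, 29, 35, 52, 33]
print(round(pearson_r(scale_values, ages), 3))
```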

THE    SIGNIFICANCE    OF   AGE 

The  coefficients  of  correlation  between  teaching  ability  and  age 
for  each  of  the  six  groups  of  teachers  have  been  computed.  The 
age  at  the  last  birthday  has  been  taken.  Fractional  parts  of  a 
year  have  not  been  used.  It  has  not  been  possible  to  check  the 
correctness  of  the  ages  of  all  the  teachers,  but  those  teachers  who 
were  members  of  the  State  Pension  Fund,  and  practically  all  the 
teachers  were  members,  were  checked  from  the  affidavits  which 
are  on  file  at  the  office  of  the  Fund.     The  ages  were  of  April,  1919. 

Typical  examples  of  distributions,  for  Town  A,  grade  teachers, 
are  inserted  on  the  following  page. 

It  would  seem  that  a  teacher's  age  is  not  a  very  good  index  of 
her  general  teaching  ability.  This  same  negligible  and  often 
negative  correlation  has  been  found  in  other  studies  even  when  a 
more  lenient  method  of  determining  teaching  ability  has  been 
used. 

There  is  one  factor,  however,  which  may  have  some  effect  on 
the  correlation.  In  Town  A  and  in  Town  B  no  teacher  is  at  pres- 
ent employed  who  has  not  had  elsewhere  two  years  of  successful 
experience.  The  rule  has  been  in  force  for  two  years.  On  ac- 
count of  the  exclusion,  for  even  this  short  time,  of  very  young 
teachers,  the  correlation  may  have  been  affected  as  it  would  not 
have  been  affected  in  other  school  systems. 

The  coefficients  of  correlation,  which  have  been  given  above, 
should  not  be  carelessly  taken  to  mean  that  there  is  no  relation- 
ship between  general  teaching  ability  and  age.     Obviously,  a 
child  could  not  teach.  Excessive  old  age,  on  the  other  hand,  is
not  a  negligible  factor  in  determining  general  teaching  ability. 
Within  the  limits  of  ages  at  which  people  actually  do  teach,  age 
appears  to  be  an  irrelevant  factor. 

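1. Distribution by Ages in Years:                     No. of Teachers

25 or under ......................................... 10
25 to 30 ............................................ 18
30 to 35 ............................................
35 to 40 ............................................
40 to 45 ............................................
45 to 50 ............................................
50 to 55 ............................................
55 to 60 ............................................
60 or over ..........................................


2. Distribution by Years of Experience:

Less than 5 ......................................... 11
5 to 10 ............................................. 20
10 to 15 ............................................
15 to 20 ............................................
20 to 25 ............................................
25 to 30 ............................................
Over 30 .............................................


These are grouped distributions. In computing the correlations, the actual
fact was used in each case. Grouping of this kind, however, is sufficient to
show the facts of distribution.

3. Distribution by Amount of Professional Study while in Service:

No professional study ............................... 33

Professional study equivalent:

To one summer-school session of work in education ... 12
To two summer-school sessions of work ...............  6
To three summer-school sessions of work .............  1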

                                 r of Age with Group A   r of Age with
                                 Rating of Teachers'     Supervisors'
                                 Judgments for General   Estimates of General
                                 Teaching Ability        Teaching Ability

Town A   Grade teachers          +.191                   +.047
         High-school teachers    −.151                   −.001
Town B   Grade teachers          +.050                   −.100
         High-school teachers    +.525                   +.422
Town C   Grade teachers          −.050                   −.108
         High-school teachers    +.604                   +.335

Average (weighted for number
in each group)                   +.135 ±.07              +.0298 ±.07
THE    SIGNIFICANCE    OF    EXPERIENCE

The  factor  of  experience  has  been  studied  in  two  ways — (1) 
total  experience  in  teaching,  wherever  that  experience  has  been 
gained,  and  (2)  the  experience  gained  in  the  present  school  sys- 
tem or  position.  No  significant  mutual  relationship  appeared. 
These  correlations  may  be  affected  by  the  fact  that  teachers  who 
are  fresh  from  the  normal  schools  are  not  engaged.  In  these 
data  experience  as  such  mattered  little.  While  it  is  clear  that  a 
teacher  as  she  becomes  older  does  not  necessarily  become  better 
by  any  process  of  inner  growth,  it  is  also  clear  that  the  older 
teacher is not necessarily the poorer one. As far as these data
reveal the true situation, neither amount of experience nor age
should be considered factors of large significance in the assurance
of teaching success.

                                 r between Total         r between Total
                                 Experience and          Experience and
                                 General Teaching        General Teaching
                                 Ability as Determined   Ability as Determined
                                 by Group A, Teachers'   by Supervisors'
                                 Judgments               Judgments

Town A   Grade teachers          +.018                   +.102
         High-school teachers    −.079                   −.102
Town B   Grade teachers          +.140                   +.135
         High-school teachers    +.531                   +.422
Town C   Grade teachers          −.249                   +.135
         High-school teachers    −.180                   +.340

Average (weighted for number
in each group)                   −.0386, ±.11            +.140, ±.10

                                 r between Local         r between Local
                                 Experience and          Experience and
                                 General Teaching        General Teaching
                                 Ability as Determined   Ability as Determined
                                 by Group A, Teachers'   by Supervisors'
                                 Judgments               Judgments

Town A   Grade teachers          +.047                   −.015
         High-school teachers    −.079                   −.066
Town B   Grade teachers          +.144                   +.089
         High-school teachers    +.364                   +.250
Town C   Grade teachers          −.148                   −.275
         High-school teachers    +.510                   +.416

Average (weighted for number
in each group)                   +.124, ±.73             +.137 ±.7

If  there  were  little  difference  in  the  amounts  of  experience  that
the  teachers  possessed,  then  the  coefficients  of  correlation  would 
be  low,  not  because  experience  was  not  a  factor  in  determining 
general  teaching  ability,  but  because  all  teachers  had  the  same 
amount  of  it.  The  amounts  of  experience,  of  age,  and  of  pro- 
fessional study  in  the  case  of  the  grade  teachers  of  Town  A  have 
already  been  shown  to  illustrate  what  the  differences  in  experience 
actually  were. 

The  same  is  true  of  salary,  age,  or  any  other  factors.  In  short, 
if  series  A  and  series  B  are  to  be  correlated,  no  distribution  in 
either  A  or  B  would  mean  a  zero  correlation.  But  in  all  of  our 
series  distributions  do  occur.  The  low  correlations  cannot  be 
accounted  for  by  absence  of  distribution. 
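
The argument about distribution can be illustrated numerically. What follows is a minimal Python sketch, not the computation used in this study: `pearson_r` is a hypothetical helper that applies the product-moment formula to deviations from the mean, and the experience and rating figures are invented for illustration. Strictly, a series with no spread makes the formula 0/0, so the helper returns `None` in that case rather than zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation from deviations about each mean.

    Returns None when either series has no spread, because the
    denominator is then zero and r cannot be computed.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        return None
    return sum_xy / sqrt(sum_xx * sum_yy)

# Invented data: years of experience and a teaching rating.
experience = [1, 2, 3, 5, 8, 12, 15, 20]
rating = [52, 50, 58, 61, 60, 70, 68, 75]

# Full range of experience.
r_full = pearson_r(experience, rating)

# Only the three least experienced teachers: a narrow distribution.
r_narrow = pearson_r(experience[:3], rating[:3])

# Every teacher has the same experience: no distribution at all.
r_constant = pearson_r([5] * len(rating), rating)
```

In this invented series the narrow group gives a lower coefficient than the full group, and the constant series gives none at all, which is the situation the text rules out for the present data.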


CORRELATION    OF   TEACHING   ABILITY   AND    SALARY    RECEIVED 

At  the  time  that  this  study  was  made  there  was  no  adequate 
salary  schedule  operative  in  Town  A.  Although  there  were  some 
salary  differences,  many  factors,  other  than  those  of  services 
rendered,  were  effective  in  determining  the  salaries  which  were 
paid.  In  Town  B  and  Town  C,  however,  salary  schedules,  based 
on  merit,  were  already  well  started.  The  correlations,  therefore, 
are  of  particular  interest. 

r between Salary and General Teaching Ability

                                  As determined by     As determined by
                                  Group A teachers'    supervisors'
                                  judgments            judgments

Town A   Grade teachers           Not computed         Not computed
         High-school teachers     Not computed         Not computed
Town B   Grade teachers           +.359                +.410
         High-school teachers     +.130                +.089
Town C   Grade teachers           +.575                +.083
         High-school teachers     +.676                +.515

Average (weighted for number
in each group)¹                   +.434, ±.08          +.263, ±.09

1 Town A not counted; others weighted for size of groups.

It will be seen that the coefficients of correlation, although
they are not high, are at least positive, even in the judgments of
the teachers themselves. In view of the fact that men are paid
higher than are women, though not primarily for their better
service, but because of their sex, it may be a fact that a few
men  in  the  grades  are  receiving  relatively  high  salaries,  although 
they  have  only  moderate  ability.  If  this  be  so,  then  the  coeffi- 
cients of  correlation  will  be  somewhat  lowered.  Perhaps  we 
should  then  more  properly  compute  the  correlations  for  women 
only.  In  neither  Town  B  nor  Town  C  was  this  the  case,  for  there 
was  only  one  man  involved,  and  he  was  rated  high  and  paid  well. 
I  mention  this  possible  condition,  to  caution  those  who  are  making 
similar  studies  in  which  important  sex  differences  occur. 

In  the  case  of  the  high-school  teachers,  however,  a  distinction 
of  sex  should  be  made  in  order  to  eradicate  sex  as  a  factor — and 
not  ability  as  a  factor — in  the  salaries  which  teachers  receive. 

THE RELATION BETWEEN GENERAL TEACHING ABILITY AND SCORES
MADE IN TWO PSYCHOLOGICAL TESTS ¹

Approximately  one  hundred  teachers  were  given  psychological 
tests. 

The  first  test  might  be  called  a  test  of  mental  alertness.  It  has 
been  used  sufficiently  in  many  other  connections  to  warrant  the 
placing  of  considerable  confidence  in  it.  This  test  was  divided 
into  two  parts.  Each  part  lasted  somewhat  over  thirty  minutes. 
Before  the  test  was  given,  the  teachers  had  an  opportunity  to 
look  over  a  similar  test  so  that  unfamiliarity  with  the  material 
would  not  be  a  handicap.  The  scoring  and  the  methods  of  com- 
puting final  ratings  are  standardized. 

r between General Teaching Ability and Mental Alertness

                                    As determined by     As determined by
                                    Group A teachers'    supervisors'
                                    judgments            estimates

[town illegible]  Grade teachers          -.099          +.115
                  High-school teachers    +.381          +.346
[town illegible]  Grade teachers          +.306          +.230
                  High-school teachers    +.545          +.484

A second intelligence test was given to the grade teachers and
high-school teachers of Town A. The r's obtained from the second
test were:

Town A   Grade teachers           +.060                +.179
         High-school teachers     +.480                +.648

1  The  tests  used  here  are  known  as  the  first  section  of  Thorndike  College 
Entrance Examination.

The correlation between the two tests was +.812. It shows
to what extent the same ability was measured in both tests. The
correlation between general teaching ability in the elementary
grades, as estimated by teachers who served as judges, and
intellect, as measured by this test, is +.173, ±.10. The correlation
between general teaching ability, as estimated by supervisors
who served as judges, and intellect, as measured by psychological
tests, is +.156, ±.10.

The correlations are distinctly higher in the case of high-school
teachers. The mutual relationship between intellect and
teaching ability, as measured by teachers' estimates, is +.446,
±.16. When general teaching ability is estimated by the
supervisors the correlation is +.410, ±.16. These correlations are
averages which have been weighted for the size of teacher groups.
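
The weighting can be sketched in a few lines of Python. This is an illustration only: `weighted_mean_r` is a hypothetical helper, the two coefficients are of the size shown in the table above, and the group sizes of 12 and 8 are assumed, since the number of teachers in each group is not given at this point in the text.

```python
def weighted_mean_r(groups):
    """Average several correlations, weighting each by its number of cases.

    `groups` is a list of (r, number_of_cases) pairs.
    """
    total_cases = sum(n for _, n in groups)
    return sum(r * n for r, n in groups) / total_cases

# (coefficient, ASSUMED number of teachers in the group)
high_school_groups = [(0.381, 12), (0.545, 8)]
average_r = weighted_mean_r(high_school_groups)
```

A larger group pulls the average toward its own coefficient, so a high r from a handful of teachers does not dominate the result.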

By using these two tests as separate measures of intellect and
by using the Group A mutual judgments and the supervisors'
estimates as two separate measures of teaching ability, we can
correct for attenuation, and get a final correlation of +.57
between general teaching ability and scores which have been made
in psychological tests, in the case of high-school teachers.
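
The idea of correcting for attenuation can be sketched as follows. This is a minimal Python illustration of Spearman's classical correction (the observed correlation divided by the square root of the product of the two reliabilities); the exact formula used in this study is stated in Chapter IV and may differ in detail. The function name is hypothetical, the four cross-correlations are illustrative values of the size reported above, the agreement between the two tests (+.812) is taken from the text, and the agreement between the two sets of judges (0.75) is an assumed figure. The result is therefore not expected to equal the +.57 reported.

```python
from math import sqrt

def correct_for_attenuation(cross_correlations, reliability_x, reliability_y):
    """Spearman's correction: mean observed r over sqrt of the reliabilities.

    cross_correlations: r's between each measure of X and each measure of Y.
    reliability_x: r between the two measures of X (here, the two tests).
    reliability_y: r between the two measures of Y (here, the two ratings).
    """
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    observed = sum(cross_correlations) / len(cross_correlations)
    return observed / sqrt(reliability_x * reliability_y)

cross = [0.381, 0.346, 0.480, 0.648]   # illustrative test-vs-rating r's
tests_agreement = 0.812                # from the text
judges_agreement = 0.75                # ASSUMED, not given in this section
corrected = correct_for_attenuation(cross, tests_agreement, judges_agreement)
```

Because both reliabilities are below 1, the corrected coefficient is always larger in size than the average observed coefficient.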

The  practically  zero  correlation  between  the  teaching  ability 
of  grade  teachers  and  mental  alertness,  as  measured  by  test,  does 
not  mean  that  intellect  is  an  irrelevant  factor  in  teaching.  For 
there  is  no  occupation  in  which  intellect  is  not  to  some  extent 
useful.  Even  a  man  with  a  pick  can  use  his  intellect  to  advantage 
in  deciding  where  best  to  grasp  the  handle  of  the  pick,  in  deter- 
mining the  distance  which  one  foot  should  be  ahead  of  the  other, 
and  in  arriving  at  other  conclusions.  Intellectual  differences, 
however,  among  those  who  use  the  pick  are  not  as  significant  as 
they  would  be  among  surgeons,  philosophers,  or  psychologists. 
Although  brains  are  of  use  in  picking,  it  is  also  true  that  physical 
strength,  lung  expansion,  large  nasal  passages,  and  other  factors 
are  relatively  of  much  greater  importance  than  intellect. 

In  elementary-school  teaching,  even  in  the  most  routine  work, 
intellect  can  be  used  and  is  used,  but  patience,  industry,  sym- 
pathy, and  other  qualities  are  relatively  of  greater  importance 
than  intellect.  The  differences  in  intellect  among  teachers  are, 
as  it  were,  lost  in  the  complexities  of  differences  in  the  amount  of 
many  other  traits  which  are  also  important  in  elementary-school 
teaching.

For high-school teachers this is not true to the same degree.
There  does  appear  to  be  some  relationship  between  differences  in 
intellect  and  differences  in  teaching  ability.  High-school  pupils 
are  more  mature;  the  content  of  high-school  subjects  is  less  under 
the  spell  of  method  than  it  is  in  the  elementary-school  subjects. 
Therefore,  sheer  intellectual  ability  does  operate  in  a  way  that 
it  does  not  seem  to  operate  in  the  elementary-school  teaching. 
We  have  too  few  cases,  however,  from  which  to  generalize. 

THE  SIGNIFICANCE  OF  ABILITY  TO  PASS   A   PROFESSIONAL   TEST    IN 
RELATION  TO  GENERAL  TEACHING  ABILITY 

Tests  of  a  professional  nature  are  often  used  as  a  means  of 
determining  a  candidate's  fitness  for  election  or  promotion.  It 
was, therefore, entirely within our province to determine, as
accurately as possible, the correlation between the ability to pass
a professional test and the ability to teach. An examination,¹
which called more or less definitely for knowledge of the technique
of teaching, was given. The time allowed was seventy minutes.

While no objective means for correction² were available,
due care in the correction work was taken. The names of
the teachers were not written on the papers until after the
corrections were made. By this method, any tendency on the part
of the examiner to favor some papers and to discriminate
against others was avoided.

For grade teachers in Town A the r between ability to pass a
professional test and teaching ability as determined by teachers'
ratings was +.450 (number of cases 33). The r between ability
to pass a professional test and teaching ability as rated by
supervisors was +.767 (number of cases 33).

For high-school teachers in Town A the r between ability to
pass a professional test and ability to teach as determined by
teachers' ratings was +.147 (number of cases 7). The r between
ability to pass a professional test and ability to teach as rated by
supervisors was +.001.
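
How much weight such coefficients can bear depends on the number of cases. The sketch below uses the classical probable-error formula for a correlation coefficient, P.E. = 0.6745 (1 - r²) / √n. The text does not say which formula produced the ± figures in this study, so this choice is an assumption, and `probable_error_of_r` is a hypothetical helper name.

```python
from math import sqrt

def probable_error_of_r(r, n):
    """Classical probable error of a correlation coefficient r from n cases."""
    return 0.6745 * (1 - r * r) / sqrt(n)

# Coefficients and case counts quoted in the text (teachers' ratings).
pe_grade = probable_error_of_r(0.450, 33)   # grade teachers, 33 cases
pe_high = probable_error_of_r(0.147, 7)     # high-school teachers, 7 cases
```

With only 7 cases the probable error under this formula is larger than the coefficient itself, which is why the high-school figures can support only tentative conclusions.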

It is unfortunate that more cases could not have been secured.
From the evidence which we have, it would seem that knowledge
which is required to pass a test such as the one referred to above
is not necessary for a person who wishes to be successful in
high-school teaching.

1 A copy of the examination used is filed with original data at Teachers College,
Columbia University. Revised copies known as "A Trade Test for Elementary-
School Teachers" by Knight and Franzen can be secured from the writer.

2 The correction of the examinations was made by the writer.

For elementary-school teaching such knowledge is much more
needed. Variations among teachers in their ability to pass a
test such as the one referred to above are more significant than
variations in their age, in their experience, or in their salary.

These  data  strongly  suggest  the  practicability  (1)  of  selecting 
high-school  teachers  by  psychological  test  and  (2)  of  selecting 
elementary-school  teachers  by  a  test  which  involves  a  knowledge 
of  the  technique  of  teaching. 

If professional tests could be made as accurate tests of technical
knowledge as psychological tests are of intellect, then
the correlation between the achieved scores and success in
elementary-school teaching might well be measurably increased.

Further,  if  high-school  teachers  were  given  a  more  extended 
psychological  test,  let  us  say  at  least  three  hours  instead  of  one, 
as  here  indicated,  then  the  results  might  be  even  more  indicative 
of  their  teaching  ability  as  a  qualification  for  work  in  the  high 
school. 

THE  SIGNIFICANCE  OF  PROFESSIONAL  STUDY  WHILE  IN  SERVICE  IN 
RELATION  TO  GENERAL  TEACHING  ABILITY 

Much  stress  has  lately  been  placed  upon  the  value  of  profes- 
sional study  while  in  service.  Many  school  administrators  place 
a  high  value  upon  summer-school  and  university  study  in  which 
teachers  may  engage.  In  many  cases  salary  adjustments  are 
made  in  part,  at  least,  upon  the  fact  that  a  specified  teacher  has 
taken  professional  courses  in  education. 

It  is  of  value  to  ascertain  what  effect  this  professional  study 
has  upon  the  general  teaching  ability  of  an  individual  teacher  or 
group  of  teachers.  We  know  in  the  cases  of  the  teachers  whom 
we  have  studied  closely,  how  these  teachers  stand  in  general  teach- 
ing merit  and  also  the  amount  of  professional  study  while  in 
service  which  they  have  to  their  credit. 

Is  it  the  case  that  those  teachers  who  are  studying  their  pro- 
fession are  also  the  teachers  who  stand  high  in  general  teaching 
merit?  Even  if  this  were  the  fact,  it  would  not  be  clear  just  what 
the  fact  might  mean.  For  example,  it  might  mean  that,  because 
a  teacher  studied,  she  gained  power  and  was,  therefore,  a  better 
teacher. It might mean that, because she was a good teacher,
she was, therefore, deeply interested in the technique of teaching
and  consequently  studied.  It  might  mean  that  the  motive  for 
doing  the  proper  thing  is  operative  and  that  those  teachers  take 
summer-school  work  who  are  most  easily  influenced  by  the  desire 
to  please  their  administrative  officers.  It  might  mean  that  the 
correlation  between  good  teaching  and  professional  study  is  more 
or  less  fictitious. 

It  is  also  true  that  many  teachers  would  like  to  study,  but,  for 
domestic  and  financial  reasons,  cannot  do  so.  The  teacher  may 
be  good  in  her  work  because  she  studies.  She  may  study,  how- 
ever, because  she  is  good  in  her  work.  On  the  other  hand,  whether 
she  studies  or  not  may  have  an  indifferent  relation  to  her  merit. 
Finally,  the  true  relation  between  professional  study  and  teaching 
ability  may  be  a  composite  of  all  these  possibilities,  which  is  prob- 
ably the  case.  Unfortunately,  for  this  consideration  my  data 
are  scant. 

In  Towns  B  and  C  so  few  teachers  had  done  any  organized 
study,  while  they  were  in  service,  that  no  relation  could  be  estab- 
lished in  these  school  systems  between  professional  study  and 
quality  of  service.  In  Town  A  enough  of  the  teachers  had  done 
summer-school  and  university  work  while  they  were  in  service  to 
make  the  study  worth  while. 

Six  weeks  of  professional  study  in  one  course  was  counted  as  the 
unit  of  measurement  of  professional  study.  Amounts  of  profes- 
sional study  are  not  as  good  measures  as  amounts  of  study  plus 
quality,  but  the  quality  of  professional  study  during  service  is  a 
fact too elusive to obtain. Those teachers who undertook no
professional study were, of course, rated as having done a zero amount.
The correlations follow:

r between General Teaching Ability and Professional Study

                                  As determined by     As determined by
                                  Group A teachers'    supervisors'
                                  judgments            estimates

Town A   Grade teachers           +.275                +.381
         High-school teachers     +.422                +.364

Number of cases used in this computation, 52.

In  Town  A  no  teacher  had  been  forced  to  study.  While  some 
premium had been placed on professional study, even those who
had been given the opportunity of professional study were not
chosen  from  among  the  ablest  teachers.  This  fact  would  tend  to 
lower  the  correlation.  We  cannot,  of  course,  tell  whether  some 
teachers  who  had  done  professional  study  would  have  been  dif- 
ferently rated  if  they  had  not  done  so.  We  cannot  tell,  on  the 
other  hand,  how  other  teachers  would  have  been  rated  had  they 
undertaken  professional  study.  In  view  of  the  presence  of  irrele- 
vant factors  which  tend  to  lower  the  correlation,  it  seems  fair  to 
say  that,  under  ideal  conditions  where  all  teachers  can  study,  if 
they  wish  to  do  so,  the  true  correlation  between  professional  study 
while  in  service  and  teaching  merit  will  be  no  lower  and,  in  all 
probability,  will  be  higher,  than  the  correlation  which  we  have 
obtained. 

The  effect  of  professional  study  is  as  yet  not  clear.  The  fact 
of  a  positive  though  small  correlation,  in  spite  of  factors  which 
tend  to  lower  it,  seems  to  justify  the  use  of  such  a  factor  as 
the  amount  of  professional  study  for  diagnostic  and  prognostic 
purposes. 

IS    QUALITY    OF    PENMANSHIP    AN    INDEX    OF    TEACHING    ABILITY? 

In  the  professional  test  (see  page  27)  there  was  a  copy  of  a 
letter,  uncapitalized  and  unpunctuated,  which  was  to  be  copied. 
This,  of  course,  would  be  interpreted  by  any  teacher  who  took 
the  examination  as  an  exercise  in  punctuation  and  capitalization 
— and  such  it  was.  It  might  also  be  used  as  a  very  convenient 
test  of  the  quality  of  a  teacher's  handwriting.  We  get  a  much 
truer  picture  of  how  teachers  write  under  working  conditions,  by 
using  material  of  this  kind,  than  we  can  get  if  we  merely  asked 
teachers  to  furnish  a  specimen  of  their  handwriting. 

This material, used as an index of the handwriting ability of
teachers, was scored for legibility by using the Thorndike scale,
and quantitative values were assigned to the specimens of
handwriting. The scoring was made with the scorer ignorant of the
names of the persons who wrote the specimens. The correlations
between the legibility of the teachers' handwriting, as expressed
in terms of amount of legibility, and their general teaching
ability rating follow:

r between General Teaching Ability and Legibility of Handwriting

                                  As determined by     As determined by
                                  Group A teachers'    supervisors'
                                  judgments            estimates

Town A   Grade teachers           +.001                +.012

That  legibility  of  penmanship  is  no  index  of  teaching  ability 
seems  clear.  It  should  be  added,  however,  that  the  variations  in 
the  legibility  of  the  handwriting  were  small.  Most  of  the  teach- 
ers are  so-called  "Palmer  handwriting  certificate  holders."  The 
restricted  spread  in  differences  in  handwriting  ability  is  not 
needed  to  explain  the  zero  correlation.  As  a  matter  of  common 
sense  there  is  no  causal  relation  between  handwriting  and  ability 
to  teach. 

THE    MUTUAL    RELATION    BETWEEN    GENERAL    TEACHING    ABILITY 
AND    NORMAL-SCHOOL    SUCCESS 

The  relationship  between  general  teaching  ability  and  normal- 
school  success  has  been  obtained  in  two  ways.  First,  a  study  was 
made  of  the  relation  between  those  teachers  who  came  from  the 
same  normal  school  and  those  teachers  who  now  teach  in  the 
same  group.  By  this  rigorous  requirement  errors  have  been 
eliminated  which  would  exist  if  comparisons  were  made  with  the 
records  of  teachers  who  came  from  different  normal  schools,  or 
if  comparisons  were  made  with  the  records  for  teaching  ability 
which  would  come  from  equating  the  relative  merits  of  teachers 
who  work  in  different  systems. 

It  is  exceedingly  difficult  to  get,  for  any  considerable  number  of 
teachers,  accurate  measures  of  the  teachers'  standings  in  normal 
schools  and  of  the  success  in  teaching  of  the  same  group.  For, 
upon  leaving  the  normal  schools,  the  graduates  scatter.  In  any 
school  system  there  are  teachers  who  come  from  so  many  different 
schools and at such widely varying times that large error is likely
to  creep  into  any  investigation  of  the  relation  of  normal-school 
success  to  general  teaching  ability. 

While  the  procedure  adopted  in  this  study  was  calculated  to 
reduce  error  and  doubtless  did  reduce  the  error,  it  also  reduced 
the number of cases. The correlations, which are positive, are
for grade teachers, +.147 (19 cases); for high-school teachers
with college records, Town B, +.600 (6 cases).

The standing in normal school or college was determined by a
complex process. The grades which determined the total standing
were all added together and divided by the number of grades.
Not all grades were counted as having equal value. Thus a grade
"A" in English was not counted as the equal of a grade "A" in
history. The values of the grades "A," "B," "C," "D," "E"
of each study were determined by taking all the grades of each
study and by computing the percentage of each grade of the
total. Then a probability curve for them of form "A" was
assumed. A computation of their value in terms of the standard
deviation (S.D.) distance from the mean was made. Thus the
inequalities of grading in each department were to some extent
at least neutralized. As an illustration, let us consider the 194
grades in English, which were distributed as follows: 11 or 5
per cent were "F"; 20 or 10 per cent were "E"; 37 or 19 per cent
were "D"; 74 or 38 per cent were "C"; 49 or 25 per cent were
"B"; 3 or 1 per cent were "A."

Assuming a normal distribution of ability and using the S.D.
values which are found in Thorndike's Mental and Social
Measurements, we assign to "E" a value of minus 20, to "D" a value of
minus 12, and so on. The values of the several grades in the other
subjects were similarly determined. The grades received in
practice teaching, English, arithmetic, history, science, and
method were used in the computation. We do not know the
quantitative value of "F" in terms of "E" or "D." We cannot say
that "B" is twice as good as "E," etc. These marks can be used,
however, in denoting relative positions in a group or series.
These relative positions, in turn, can be translated* into terms of
amount. The correlation was obtained between teaching ability
and normal-school standing for such persons only as were
teaching in the same group and came from the same normal school.
This rigorous method of selecting data reduced the number of
available cases to 19.
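
The translation of letter grades into amounts can be sketched in Python under the stated assumption of a normal distribution. This is an illustration and does not reproduce Thorndike's table: `grade_values` is a hypothetical helper that gives each grade the average position, in S.D. units from the mean, of the slice of the normal curve which that grade occupies. The counts are the English grades quoted above, with "D" taken as 37 so that the counts total the stated 194.

```python
from statistics import NormalDist

def grade_values(counts):
    """Map ordered grade counts (lowest grade first) to mean S.D. positions.

    Each grade covers a slice of the standard normal curve whose area
    equals its share of all grades; the value returned is the mean of
    the curve within that slice. Every count must be greater than zero.
    """
    curve = NormalDist()            # mean 0, S.D. 1
    total = sum(counts.values())
    values = {}
    running = 0
    lower_density = 0.0             # curve height at the far left tail
    for grade, n in counts.items():
        running += n
        if running == total:
            upper_density = 0.0     # curve height at the far right tail
        else:
            upper_density = curve.pdf(curve.inv_cdf(running / total))
        values[grade] = (lower_density - upper_density) / (n / total)
        lower_density = upper_density
    return values

# English grades from the text; "D" = 37 so that the total is 194.
english = {"F": 11, "E": 20, "D": 37, "C": 74, "B": 49, "A": 3}
english_values = grade_values(english)
```

Grades given to few pupils at either extreme receive values far from the mean, while the common middle grade lies close to it, so an "A" in a subject where it is rarely given counts for more than an "A" in a subject where it is common.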

* For the process see Thorndike's Mental and Social Measurements, Table 54,
p. 221.

This correlation of teaching ability with normal-school success
or standing is dependable, because it uses a very accurate rating
of teaching ability. The method of its determination has
eliminated errors in obtaining normal-school standing by taking only
those teachers from the same normal school and by evaluating
the marks which the teachers received in different studies.
Variations in the meaning of marks, however, may occur in one
subject from year to year as they do from subject to subject.
Whether marks actually do or do not vary is a matter of
conjecture. The real weakness in this correlation is due to the small
number of cases. Since a correlation using fewer than 50 cases
lacks numerical strength, the correlation between normal-school
standing and teaching ability has been computed in still another
way. The following assumptions have been made:

1.  It  is  assumed  that  the  median  teacher  in  one  system  is,  to 
all  intents  and  purposes,  equal  in  teaching  ability  to  the  median 
teacher  in  the  other  two  systems,  and  that  the  summation  of  the 
quantitative  variations  from  the  median  in  one  system  is  equiva- 
lent to  the  summation  of  the  variations  in  either  of  the  other 
two  systems.  While  this  is  an  assumption,  the  correctness  of 
which  cannot  be  proved,  it  is  reasonable.  The  three  systems 
which  have  been  studied  are  all  within  metropolitan  Boston;  they 
draw  teachers  from  the  same  normal-school  systems;  they  pay 
about  the  same  salaries;  they  fit  pupils  for  the  same  colleges; 
and  they  are  not  widely  dissimilar  in  size.  It  is  fair  to  assume 
that  the  teaching  forces  are  about  the  same. 

2.  It  is  assumed  that  the  marks  in  any  one  subject  in  a  normal 
school  mean  about  the  same  as  they  do  in  any  other  subject  in 
that  school.  This  was  the  fact  when  the  values  of  marks  which 
were  given  in  the  several  subjects  by  the  Salem  Normal  School 
were  computed  for  the  first  correlation.  If  the  values  of  the 
marks  varied  to  some  degree,  the  final  result  would  not  be  seri- 
ously affected,  for  the  variations  would  be  in  all  directions  and 
would  have  the  same  effect  as  chance  errors. 

3.  It  is  assumed  that,  while  the  individual  marks  in  one  normal 
school  do  not  mean  the  same  as  they  do  in  another,  the  composite 
marks  are  comparable.  That  is,  if  we  find  that  all  the  teachers 
whom  we  studied  came  from  the  Salem  Normal  School,  then  a 
certain  teacher  is  the  median  in  normal-school  standing  for  that 
group  of  teachers  and  her  standing  is,  for  our  purpose,  the  same 
as  the  standing  of  the  median  teacher  from  any  other  normal- 
school  group  that  we  may  study.  It  is  necessary  to  make  this 
assumption in order to get enough cases. It is not an unusual
assumption, although objections to it are perfectly allowable on
mathematical or theoretical grounds.

The  normal  schools  which  are  studied  are  all  in  the  same  part 
of  the  country;  they  pay  about  equal  salaries  to  their  faculties; 
they  have  similar  courses  of  study;  they  are  in  practically  equal 
repute;  they  draw  about  the  same  class  of  pupils;  they  are  super- 
vised, with  one  exception,  from  the  same  office;  and  they  have 
such  professional  relations  among  themselves  that  their  ideals  and 
standards  are  largely  mutual.  Their  graduates  are  attracted  to 
the  same  school  system  and  are  equally  well  thought  of.  It  seems 
reasonable  to  assume  a  practical  identity  of  work  required.  This 
assumption  is  generally  made  in  practical  school  administration. 
It  is  admitted  it  has  statistical  shortcomings,  but  its  validity  for 
this  purpose  may  be  allowed.  Working  upon  these  assumptions, 
the  writer  has  computed  the  correlation  between  the  normal-school 
standing  of  53  teachers  and  their  success  in  teaching  out  in  the 
field,  which  is  +.333.  The  ratings  given  to  teachers  by  their 
fellow-teachers  were  used  as  the  quantity  to  represent  teaching 
ability.  The  normal-school  standing  was  the  numerical  value  of 
the  average  grade  received. 
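
One way to read the first assumption above is that each system's ratings are placed on a common scale before pooling. The Python sketch below is an interpretation, not the procedure documented in this study: `rescale_about_median` is a hypothetical helper that shifts a system's ratings to a zero median and divides by the mean absolute deviation from the median, and the ratings for the two towns are invented.

```python
from statistics import median

def rescale_about_median(ratings):
    """Shift ratings to a zero median and a mean absolute deviation of 1."""
    mid = median(ratings)
    deviations = [r - mid for r in ratings]
    mean_abs = sum(abs(d) for d in deviations) / len(deviations)
    return [d / mean_abs for d in deviations]

# Invented ratings, each on its own system's scale.
town_a = [40, 55, 60, 72, 90]
town_b = [4.0, 5.5, 6.0, 6.5, 9.0]

scaled_a = rescale_about_median(town_a)
scaled_b = rescale_about_median(town_b)

# After rescaling, the two systems can be treated as one series.
pooled = scaled_a + scaled_b
```

After this step the median teacher of every system stands at zero, and the typical distance from the median is the same in each, which is what the assumption requires before cases from different systems are combined.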

THE    VALUE    OF   PUPILS'    ESTIMATES    OF   TEACHERS 

In  school  administration  we  have  never  taken  into  account  the 
fact  that  the  estimates  of  pupils  of  their  teachers  might  be  valu- 
able. The  importance  of  having  content  to  which  pupils  would 
respond  and  of  having  methods  to  which  pupils  would  favorably 
react  has  been  repeatedly  discussed,  but  we  have  assumed,  on  the 
whole,  that  pupils'  judgments  of  their  teachers  were  either  unob- 
tainable or  useless. 

We  may  yet  find  that  there  is  a  closer  relationship  between 
pupils'  success  in  school  and  their  reaction  to  the  teacher  than 
there  is  between  their  success  and  the  methods  of  teaching  read- 
ing, or  the  size  of  print  in  the  text-books,  or  the  amount  of  play 
space,  or  any  other  so-called  important  factor  of  school  manage- 
ment. 

Pupils  may  be  as  competent  judges  of  good  teaching  as  anyone 
else.  They  are  certainly  the  most  concerned.  Data  will  show 
that  it  is  not  the  poor  teacher  in  the  eyes  of  the  supervisor  who  is 
the  good  teacher  in  the  eyes  of  the  pupils. 

The estimates of the pupils were obtained by asking the pupils
to write on a sheet of paper the names of all the teachers whom
they  had  ever  had.  Then  it  was  explained  to  them  that  they  were 
going  to  say  which  of  all  these  teachers,  all  things  considered,  was 
the  best.  Reasonable  precautions  were  taken  in  giving  the  di- 
rections to  have  the  pupils  understand  the  meaning  of  best  and 
the  importance  of  making  the  most  deliberate  ratings  that  they 
could.  In  all  cases  the  pupils  were  told  not  to  write  their  own 
names  and  not  to  hurry  in  their  answers.  After  the  names  of 
teachers  whom  they  considered  best,  they  wrote  the  word  best. 
After  the  next  best  teacher,  they  wrote  the  word  next;  and  after 
the  third  best  teacher,  they  wrote  the  word  third.  In  this  manner 
the following groups of pupils judged their teachers: two groups
of  high-school  pupils,  one  of  seventh-grade  pupils  and  one  of 
eighth-grade  pupils.  To  offset  the  factor  of  forgetting  on  the 
part  of  the  pupils,  in  the  computation  only  those  teachers  whom 
the  pupils  were  having  at  the  time  of  their  making  the  judgments 
were  considered. 

The two high-school groups were obtained by having the
principal call together the forty most dependable pupils in the school.
The elementary-school group was composed of 200 pupils in the
departmentalized grades of a school in Town A. The teachers
who were judged by the pupils fell into three groups, or sets of
cases, of 11, 15, and 13 teachers. The writer is certain that the pupils
responded thoughtfully to his request for their judgments and
that careful opinion was expressed. Each group of pupils'
judgments was then divided into two chance groups and these were
treated separately. The fact that the correlations between these
groups were +.767, +.517, +.905 respectively for three groups
of pupils shows that factors of chance were not operating to any
great degree.
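
The division into two chance groups can be sketched in Python as follows. The sketch rests on assumptions that the text does not settle: the point values for "best," "next," and "third" (3, 2, 1) are hypothetical, as are the helper names, the teacher labels, and the forty invented ballots.

```python
import random
from math import sqrt

POINTS = {"best": 3, "next": 2, "third": 1}   # ASSUMED scoring weights

def tally(ballots, teachers):
    """Total the points each teacher receives from a set of ballots."""
    scores = {t: 0 for t in teachers}
    for ballot in ballots:
        for rank, teacher in ballot.items():
            scores[teacher] += POINTS[rank]
    return [scores[t] for t in teachers]

def pearson_r(xs, ys):
    """Product-moment correlation; assumes both series have some spread."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def split_half_r(ballots, teachers, seed=0):
    """Split ballots into two chance halves and correlate the tallies."""
    shuffled = list(ballots)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    first = tally(shuffled[:half], teachers)
    second = tally(shuffled[half:], teachers)
    return pearson_r(first, second), first, second

# Invented ballots from forty pupils about five teachers.
teachers = ["T1", "T2", "T3", "T4", "T5"]
ballots = (
    [{"best": "T1", "next": "T2", "third": "T3"}] * 30
    + [{"best": "T2", "next": "T1", "third": "T4"}] * 10
)
r_halves, first_half, second_half = split_half_r(ballots, teachers)
```

When the two halves rank the teachers in nearly the same order, as they do with these invented ballots, the coefficient is high, and chance differences between pupils cannot account for the ratings.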

The  correlations  between  the  pupils'  estimates  of  teachers  and 
the  estimates  of  the  fellow-teachers  and  supervisors,  follow  on 
the  next  page. 

Using the two halves of the pupils' estimates as two independent
measures of the pupils' opinions and the mutual judgments of
teachers and the supervisors' estimates as two independent
estimates, we correct these correlations for attenuation. The
corrected correlation between pupils' estimates and adult estimates
of teaching ability is found to be +.784.

The fact that the correlation between groups of pupils'
judgments was so high implies a real relation between those in whom
the  pupils  have  confidence  and  those  in  whom  supervisors  and 
fellow-teachers  have  confidence  for  their  teaching  ability.  The 
weakness  in  these  correlations  is  due  to  the  large  probable  error, 
which  is  due  to  the  small  number  (39)  of  teachers  who  are  studied. 


THE    SIGNIFICANCE    OF   INTERESTS 

The  relation,  if  any,  between  success  in  teaching  and  interests 
in  the  various  school  subjects  was  determined  by  having  the 
teachers  fill  out  blanks  which  were  constructed  to  reveal  the  rela- 
tive amount  of  interest  that  each  teacher  had  in  mathematics, 
history,  literature,  and  science. 

An  examination  of  the  data  clearly  shows  that  teachers  do  not 
have  distinct  types  of  interests.  The  better  teachers  showed  a 
slight  tendency  to  prefer  what  are  usually  considered  the  harder 
subjects. 


CHAPTER  IV 

THE  RELATIVE  SIGNIFICANCE  OF   THE   QUALITIES 

MEASURED 

We have obtained, in the case of six groups of teachers, a quantitative
rating for general teaching ability and we have correlated
success in teaching with certain measurable facts about teachers.
The number of cases involved was 153. In some of the correlations
the total number of cases was not used.

METHOD    USED 

The process of obtaining the rating for general teaching ability
and the process by which certain significant facts about the
teachers were obtained have both been explained.

The Pearson formula for computing the coefficient of correlation,

$$r = \frac{\sum x \cdot y}{\sqrt{\sum x^{2}}\,\sqrt{\sum y^{2}}},$$

was used in all cases. In this formula x is
the divergence from the central tendency in one distribution and
y is the corresponding divergence in the other. The weighted
average correlation was obtained by weighting the correlation of
each group on the basis of the number of cases in that group.
The  formula  which  was  used  for  attenuation  follows: 
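
The formula itself does not survive in this copy. As a hedged sketch only, the correction usually attributed to Spearman divides the observed correlation by the geometric mean of the two reliability coefficients. Whether the study used exactly this variant is an assumption; the function name and the input figures below are illustrative and are not data from the study.

```python
from math import sqrt


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Spearman-style correction: divide the observed correlation by the
    geometric mean of the two reliability coefficients."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / sqrt(reliability_x * reliability_y)


# Illustrative figures only (not taken from the study): an observed
# correlation of .60 between two measures whose split-half agreements
# are .80 and .90.
corrected = correct_for_attenuation(0.60, 0.80, 0.90)
print(round(corrected, 3))
```

In the setting described here, the two reliabilities would be the agreement between the two halves of the pupils' estimates and the agreement between the teachers' and the supervisors' estimates.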


In all cases reported the coefficients of unreliability of the
correlations have been computed. The formula used for this was:

$$\sigma_{\text{true } r - \text{obt. } r} = \frac{1 - r^{2}}{\sqrt{n}}$$
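
A minimal sketch of this computation, assuming the coefficient has the form (1 - r^2) / sqrt(n). That form is inferred from the ± values printed with the correlations in Chapter V, which it reproduces approximately, rather than read from a legible formula. The function name is illustrative.

```python
from math import sqrt


def unreliability_of_r(r, n):
    """Assumed form of the coefficient of unreliability: (1 - r^2) / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of cases")
    return (1 - r ** 2) / sqrt(n)


# Town A grade teachers, general intellectual ability: r = +.861, 53 cases.
sigma = unreliability_of_r(0.861, 53)
print(round(sigma, 4))
```

For this group the sketch gives a value close to the ±.035 printed beside the correlation. The coefficient shrinks as the number of cases grows and as r approaches 1.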

Where  several  correlations  have  been  averaged,  the  weighting 
has  been  done  on  a  basis  of  the  number  of  cases.  If,  for  example, 
the correlation between teaching ability and age was +.444 for a
group of 20 and +.666 for a group of 40, the average correlation
would be +.592, counting the 20-group once, the 40-group twice,
and  then  dividing  by  three. 
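
The weighting in this example can be sketched as follows. The function name is an illustrative assumption; the figures are those of the example above.

```python
def weighted_average_r(correlations, group_sizes):
    """Average several correlations, weighting each by its number of cases."""
    if not correlations or len(correlations) != len(group_sizes):
        raise ValueError("need one group size per correlation")
    total_cases = sum(group_sizes)
    weighted_sum = sum(r * n for r, n in zip(correlations, group_sizes))
    return weighted_sum / total_cases


# The worked example from the text: +.444 for 20 cases, +.666 for 40 cases.
average = weighted_average_r([0.444, 0.666], [20, 40])
print(round(average, 3))
```

Weighting by 20 and 40 is the same as counting the first group once and the second twice, then dividing by three.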

THE    MEANING    OF    CORRELATION

The correlations vary in size and in significance. The use of
correlation in this connection is for diagnostic purposes. For
example, if we knew that the correlation between ability to pass
an intelligence test and ability to teach was +.999, all we would
have to know about a teacher would be her ability to pass an
intelligence test in order to know how good a teacher she would be.
Correlations of this sort do not exist.

Teaching  is  not  perfectly  correlated  with  any  one  thing,  except 
teaching  ability.  Perfect  correlation  between  general  teaching 
ability  and  any  other  single  quality  would  mean  for  us  complete 
identity  between  the  two  traits  correlated,  and  teaching  is  not 
identically  like  any  one  quality  or  ability  which  we  can  as  yet 
measure.  We  cannot  be  sure  just  what  qualities  must  be  pos- 
sessed, or  in  what  degrees,  or  in  what  combinations,  for  a  teacher 
to  be  a  successful  teacher.  We  do  know  that  more  than  one 
quality  is  needed. 

In  all  probability,  we  shall  know  at  some  time  and  with  scien- 
tific precision  why  the  good  teacher  is  good  and  why  the  poor 
teacher  is  poor.  We  shall  also  possess  at  some  time  the  means  of 
securing  a  satisfactory  measure. 

We  are  all  certain  that  success  in  teaching  does  not  "just  hap- 
pen," but  is  due  to  the  possession  of  certain  traits  in  certain 
amounts  and  in  many  combinations.  Some  minimum  essentials 
can  be  stated,  but  at  present  we  are  not  certain  as  to  how  we  can 
use  the  knowledge  of  minimum  essentials  which  we  now  have. 

We  know,  of  course,  that  a  stark  idiot  could  not  teach;  but,  on 
the  other  hand,  we  do  not  know  how  much  intelligence  is  the  ideal 
amount  for  the  elementary  teacher  to  possess.  It  is  not  at  all 
certain  that  unusual  intellectual  attainments  in  a  first-grade 
teacher  are,  all  things  being  considered,  worth  paying  for.  It 
has  never  been  shown  that  a  teacher  with  an  intelligence  quotient 
of  180  is  a  better  teacher,  because  of  that  rating,  than  a  teacher 
with  an  intelligence  quotient  of  120.  It  is  well  within  reason  to 
suppose  that  too  much  intelligence  among  those  who  do  some 
kinds  of  teaching  work  is  a  handicap,  just  as  in  a  corresponding 
degree  too  little  intelligence  is  a  handicap  to  other  teachers. 

Similarly,  a  certain  amount  of  health  is  a  minimum  essential 
for teaching, but it has never been shown that the healthiest
teachers are the best teachers. After a certain standard of health
is  reached,  more  health  may  not  be  effective  in  improving  the 
quality  of  teaching. 

We  may  yet  find  that  certain  ratios  between  height  and  weight, 
certain  ranges  of  body  temperature,  certain  ranges  of  emotional 
characteristics,  certain  qualities  of  vision  and  of  eyesight,  or  cer- 
tain speeds  in  time  reaction,  or  certain  flexibilities  of  memory,  or 
certain  degrees  of  blood  pressure,  are  present  in  good  teachers 
and  not  in  poor  teachers.  It  could  not,  however,  be  stated  as  a 
fundamental  hypothesis  that,  after  a  certain  degree  of  keenness 
of  vision  is  reached,  still  more  keenness  of  vision  will  correlate 
with,  or  bring  about,  better  teaching. 

Moreover,  it  is  reasonable  to  assume  that,  to  the  extent  that 
we  can  determine  relationships  between  effective  teaching  and 
objective,  measurable  facts,  we  shall  advance  toward  skill  in 
the  rating  and  prognosis  of  teaching  ability. 

Suppose,  for  the  moment,  we  found  that  the  older  a  teacher 
becomes,  the  better  teacher  she  also  becomes.  Suppose,  on  the 
other  hand,  that  the  more  poorly  a  teacher  wrote,  the  more  skill- 
ful she  was  in  governing  pupils.  Although  some  qualities  are 
not  constituents  of  teaching  ability,  as  are  intellect  and  faithful- 
ness, nevertheless  they  may  still  serve  usefully  as  indices  of 
teaching  ability. 

If  we  could  get  enough  measurable  facts  about  a  teacher  and 
then  correlate  them  with  teaching  ability,  we  should  be  able  to 
rate  teachers  successfully.  These  measurable  facts  do  exist  and 
our  problem  is  to  discover  them  and  correlate  them. 

In reviewing the correlations which have been presented, the
reader should keep in mind the simple interpretation of the meaning
of correlation; namely, (1) that there is perfect correlation
between two observable series of facts, if the presence of one fact
means the presence also of the other fact in the same way and in
the same relative degree; (2) that there is perfect negative correlation,
if the presence of one fact means the absence of the other.
For example, if we know that the older a teacher is, the poorer
she is, then there will be perfect negative correlation between
age and ability to teach.

Zero correlation exists, if the relation between the two facts is
such as would be produced by pure chance. Prediction is possible,
if the correlations are removed in size from zero. The
greater the size of the correlation, other things being equal, the
more exact the prediction.
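
The three cases just described (perfect positive, perfect negative, and zero correlation) can be illustrated with a small sketch of the Pearson coefficient in its deviation form. The data are invented for illustration and are not from the study; the function name is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 cases")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denominator = sqrt(sum(d * d for d in dx)) * sqrt(sum(d * d for d in dy))
    if denominator == 0:
        raise ValueError("r is undefined when either series is constant")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


# Hypothetical ages and three hypothetical sets of teaching ratings.
ages = [22, 30, 38, 46, 54]
r_positive = pearson_r(ages, [1, 2, 3, 4, 5])   # rating rises with age
r_negative = pearson_r(ages, [5, 4, 3, 2, 1])   # rating falls with age
r_zero = pearson_r(ages, [3, 1, 2, 1, 3])       # no linear relation to age
```

With these invented figures the first pairing gives +1, the second gives -1, and the third gives 0, so only the first two would permit prediction of the rating from age.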

The  correlations  between  ability  to  teach  and  other  sets  of  facts, 
which  have  been  found  in  this  study,  after  adjustments  have  been 
made  in  order  that  one  correlation  may  best  represent  the  facts, 
follow. 

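Correlations between:

Ability to Teach and Age .............................................. +.082
Ability to Teach and Salary ........................................... +.348
Ability to Teach and Experience ....................................... +.041
Ability to Teach and Intelligence as measured by test ................. +.164
Ability to Teach and Handwriting ...................................... +.000
Ability to Teach and Knowledge of Teaching Technique as measured
    by professional test .............................................. +.608
Ability to Teach and Study while in service ........................... +.328
Ability to Teach and Normal-school Scholarship ........................ +.147
Ability to Teach and Pupils' Estimates of Teachers .................... +.784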

Other significant correlations are:

                                                       Normal-School
                                  Test B     Test C    Standing

Test A (First Mental Test) ....    .812       .470       .559
Test B (Second Mental Test) ...               .584       .536
Test C (Professional Test) ....                          .486

General  teaching  ability  and  success  in  normal-school  studies 
were  correlated  as  follows:  English,  +.040;  Arithmetic,  +.001; 
Geography,  +.370;  Science,  +.268;  History,  +.235;  Practice 
Teaching,  +.057. 

Intellect,  as  measured  by  test,  correlates  with  ability  to  pass 
a  professional  test  in  about  the  same  degree  as  it  does  with  normal- 
school  standing,  and  ability  to  pass  a  professional  test  correlates 
a  little  lower  with  normal-school  standing  than  does  intellect, 
when  the  results  of  the  two  tests  are  pooled. 

Apparently  the  factor  of  intellect  is  quite  significant  in  normal- 
school  study,  but,  in  comparison  with  other  factors,  it  fades  out 
in  class-room  work. 

Intellect is certainly operative in ability to pass a professional
test, but it is uncertain whether the intellectual factors which
operate in ability to pass a professional test are those which account
for the correlation between ability to teach and ability to
pass a professional test, since the correlation between ability to
teach and ability as revealed in psychological tests is itself so low.

THE    MORE    IMPORTANT    TRAITS

Some  measurable  facts  do  not  appear  to  have  prognostic  value, 
while  others  do.  We  may  now  consider  the  interrelationships  of 
four  traits:  general  ability  to  teach;  ability  to  pass  a  professional 
test;  ability  to  pass  a  mental  test;  and  standing  in  normal  school  or 
normal-school  record. 

All of these traits or abilities are interrelated. There is some
correlation between a teacher's standing in normal school and
her subsequent ability to teach, her ability to pass a professional
test, her ability to pass a mental test. Some relation exists between
each trait and every other trait. The amount of positive
relationship  between  any  two  traits  which  appears  in  a  simple 
correlation  is  affected  by  the  influence  of  the  other  traits.  The 
interrelationships  are  exceedingly  complex. 

The  problem  may  be  analyzed  as  follows: 

Let G represent general teaching ability
Let I represent intellect as measured by test
Let P represent ability to pass a professional test
Let N represent normal-school record

G is related to I
G is related to P
G is related to N
I is related to P
I is related to N
P is related to N

The GI relationship is related to or affected by P
The GI relationship is related to or affected by N
The GI relationship is related to or affected by PN
The GP relationship is influenced by I and by N
The GN relationship is influenced by I and by P
The IP relationship is influenced by G and by N
The IN relationship is influenced by G and by P
The PN relationship is influenced by G and by I

The  mutual  relationship  between  ability  to  teach  and  ability 
to  make  scores  in  mental  tests  will  be  affected  in  some  measure 
by  one's  standing  in  normal  school,  because  one's  ability  to  teach 
is  affected  by  what  one  did  in  normal  school  and  one's  standing 
in  normal  school  is  also,  more  or  less,  a  result  of  intellectual 
ability. 

By a statistical procedure of partial correlation the true relation
may be found in the following cases:

G and I, when factors P and N are neutralized or non-operative
G and P, when factors I and N are neutralized or non-operative
G and N, when factors I and P are neutralized or non-operative

To make partial correlations we must have measures in all
traits for each person. It is exceedingly difficult to get measures
for all traits for many cases. We have, however, satisfactory
measures of teaching ability, ability to pass a professional test,
normal-school record, and a measure of intellectual keenness for
29 elementary-school teachers.

This  is  too  small  a  number  on  which  to  base  any  sweeping 
conclusion.  The  method  which  has  been  used  is  the  correct  one, 
however,  and  is  best  adapted  to  find  out  what  relationships  exist 
between  teaching  ability  and  certain  measurable  traits.  More 
cases,  other  studies,  further  investigations,  must  be  made  before 
the  question  can  be  finally  answered. 

These total correlations were discovered:

General Teaching Ability and Intellectual Keenness .................... ±.000
General Teaching Ability and Ability to Pass a Professional Test ...... +.541
General Teaching Ability and Normal-school Standing ................... +.153
Intellectual Keenness and Ability to Pass a Professional Test ......... +.108
Intellectual Keenness and Normal-school Standing ...................... +.371
Ability to Pass a Professional Test and Normal-school Standing ........ +.560

These partial correlations were discovered:

General Teaching Ability and Intellectual Keenness .................... +.088
General Teaching Ability and Normal-school Standing ................... -.214
General Teaching Ability and Ability to Pass a Professional Test ...... +.570
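
The text does not state which computing procedure was followed. As a sketch under that caveat, the standard recursive formula for partial correlation can be applied to the six total correlations printed above. The function names and the dictionary layout are illustrative assumptions. The results need not match the printed partials to the third decimal, presumably because the printed totals are rounded.

```python
from math import sqrt


def first_order_partial(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def second_order_partial(r, x, y, z, w):
    """Correlation of x and y with both z and w held constant.

    `r` maps frozenset({a, b}) to the total correlation of traits a and b.
    """
    def total(a, b):
        return r[frozenset((a, b))]

    r_xy_z = first_order_partial(total(x, y), total(x, z), total(y, z))
    r_xw_z = first_order_partial(total(x, w), total(x, z), total(w, z))
    r_yw_z = first_order_partial(total(y, w), total(y, z), total(w, z))
    return first_order_partial(r_xy_z, r_xw_z, r_yw_z)


# Total correlations for the 29 teachers, as printed in the text.
totals = {
    frozenset("GI"): 0.000,
    frozenset("GP"): 0.541,
    frozenset("GN"): 0.153,
    frozenset("IP"): 0.108,
    frozenset("IN"): 0.371,
    frozenset("PN"): 0.560,
}

g_and_i = second_order_partial(totals, "G", "I", "P", "N")
g_and_p = second_order_partial(totals, "G", "P", "I", "N")
g_and_n = second_order_partial(totals, "G", "N", "I", "P")
```

On these inputs the sketch agrees with the printed partials in sign and rough size: the G and P relation stays substantial, the G and N relation turns negative, and the G and I relation stays near zero.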

From  the  partial  correlations  these  deductions  seem  justifiable: 

1.  The  differences  in  mental  keenness  which  are  revealed  in  the 
passing  of  psychological  tests  do  not  correspond  with  differences 
in  teaching  success. 

2.  The  position  that  a  student  in  normal  school  holds  in  her 
class  is  not  indicative  of  her  subsequent  success  as  a  teacher. 

3.  The  relative  success  achieved  in  passing  a  professional  test 
is  correlated  positively  and  highly  with  success  in  teaching. 

4.  Matters  of  such  importance  as  we  have  been  studying  can- 
not be  settled  without  similar  investigation  of  many  more  cases, 
although  in  the  present  study  the  correct  statistical  method  has 
been  followed. 

This study indicates that age, experience, quality of handwriting,
intelligence as measured by tests, normal-school standing,
or the expressed interests of teachers are not closely related
to success in teaching.

We  have,  however,  an  indication  of  a  mutual  relationship  be- 
tween teaching  and  a  knowledge  of  the  technique  of  teaching 
which  challenges  attention.  Everyone  must  interpret  this  fact 
in  the  light  of  his  own  experience  and  judgment,  until  further 
data  have  been  analyzed. 

If  the  ability  to  pass  a  professional  test  were  an  index  of  teach- 
ing ability,  because  the  teacher  who  teaches  a  long  time  learns 
how  to  teach  and  also  how  to  pass  a  test, — that  is,  if  experience 
or  age  were  the  real  sine  qua  non  of  good  teaching, — then  that 
fact  would  have  appeared  in  our  correlations  between  age 
and  experience  with  general  teaching  ability.  It  did  not  so 
appear. 

Professional  preparation,  as  indicated  by  normal-school  stand- 
ing, does  not  appear  to  account  for  the  +  .570  correlation  between 
teaching  success  and  knowledge  of  technique,  because  normal- 
school  standing,  when  correlated  directly  with  teaching  ability 
(50  cases),  correlated  only  +.333.  In  the  partial  correlation  the 
relation  was  even  slightly  negative. 

It is not the case that relatively large amounts of pure intellectual
alertness are uniformly possessed by good teachers, while poor
teachers uniformly lack intellectual alertness. For, with 100 cases,
the correlation between success in teaching and intellect, as measured
by test, was very low, and in the partial correlation a practically
zero relationship (+.088) appeared.

The  most  reasonable  explanation  seems  to  be  along  the  line 
of  a  teacher's  interest  in  her  work.  No  other  explanation  is 
apparent  nor  is  any  other  perhaps  needed.  The  teacher  who  has  a 
genuine  interest  in  her  profession  will  learn  its  technique  and 
hence  will  pass  well  in  a  professional  test.  Those  who  have  not  a 
real  devotion  to  their  art  will  forget,  or  never  take  the  trouble  to 
master,  the  technique  of  their  work.  When,  therefore,  a  test 
which  requires  technical  knowledge  is  given  to  them  without  warn- 
ing, they  will  fail. 

Teaching,  especially  in  the  grades,  will  be  well  done  by  those 
who  are  sensitive  to  its  problems  and  thoughtful  of  their  solution. 
Interest of a substantial vital kind will explain the mutual relationship
between ability to pass a professional test and success in
actual teaching. Moreover, the ability to pass a professional
test may be taken as an index of real interest for, and of probable
success in, teaching.

If  it  were  within  the  possibilities  of  any  one  study  to  procure 
enough  cases  upon  which  to  base  final  conclusions,  we  could  take 
the  partial  correlations  between  general  teaching  ability  and 
several  measurable  factors  and,  by  combining  in  a  regression 
equation,  say  that  teaching  is  a  composite  of  measurable  factor 
a,  taken  x  times;  factor  b,  taken  y  times;  factor  c,  taken  z  times; 
factor  d,  taken  n  times. 
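
The composite described here amounts to a weighted sum of measured factors. A minimal sketch follows, in which every weight and every score is hypothetical: the study reports no regression weights, and the factor names are chosen only for illustration.

```python
def composite_rating(measures, weights):
    """Weighted sum of a teacher's measured factors (a regression-style composite)."""
    missing = set(weights) - set(measures)
    if missing:
        raise KeyError(f"no measure supplied for: {sorted(missing)}")
    return sum(weights[name] * measures[name] for name in weights)


# Hypothetical weights (the "x, y, z times" of the text) and hypothetical
# standard scores for one teacher; none of these figures come from the study.
weights = {"professional_test": 0.5, "study_in_service": 0.3, "mental_test": 0.2}
teacher = {"professional_test": 1.0, "study_in_service": 0.0, "mental_test": -0.5}

rating = composite_rating(teacher, weights)
```

In an actual regression equation the weights would be derived from the partial correlations and the variabilities of the measures, which is the step the text says must wait for a larger body of cases.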

Then,  in  order  to  rate  teachers  or  to  select  them,  the  proper 
procedure  would  be  to  procure  measures  and  combine  them  into 
a  final  rating.  This  has  not  been  done  and  will  not  be  done,  until 
some  provision  is  made  for  a  competent  investigator  with  a  staff 
of  statisticians  at  his  command  to  have  access  to  at  least  500 
elementary-school  teachers,  distributed  in  several  types  of 
school  systems,  and  studied  for  a  period  of  years.  Until  this 
situation  is  possible,  proper  checking  of  results  is  impossible. 
The  method  which  has  been  used  in  this  study  would  be  in  the 
main  a  satisfactory  procedure  for  such  an  elaborate  study. 


CHAPTER   V 

THE  RELATION  BETWEEN  SPECIFIC  TRAITS  WHEN 
SEVERAL  JUDGES  RATE  THE  SAME  TEACHERS 
IN  THOSE   TRAITS 

THE    THEORY    OF    ANALYSIS 

Teaching as a whole may be analyzed, for purposes of convenience,
into constituent parts, such as ability to ask questions;
ability  to  direct  study;  ability  to  govern;  ability  to  stimulate  the 
moral  health  of  the  community;  and  kindred  abilities.  Theo- 
retically, such  an  analysis  is  possible.  How  much  analysis  of  this 
kind  is  real,  however,  when  the  analysis  is  made  on  a  basis  of 
personal  judgment  is  uncertain. 

In  the  first  part  of  this  study  emphasis  was  placed  on  what 
many  judges  thought  about  a  teacher  in  general.  This  was 
taken  as  an  adequate  basis  of  merit.  Would  it  not  be  better  to 
use  analyzed  judgments? 

School  administrators  have  been  using  of  late  a  score  card  which 
contains  many  traits  of  teaching.  This  score  card  is  used  as  a 
basis,  or  as  a  method,  or  as  a  help,  in  aiding  administrators  to 
form  their  judgments  concerning  teachers.  So  much  has  been 
written  on  the  subject  of  score-card  rating,  and  students  of  educa- 
tional theory  and  practice  are  already  so  sufficiently  informed 
concerning  the  score-card  method  of  rating  teaching,  that  a 
further  review  of  the  literature,  other  than  the  brief  discussion  in 
the  Introduction  of  this  study,  is  redundant. 

Teaching,  in  a  certain  sense,  is  an  organic  unity,  but,  in  a  very 
useful  sense,  it  is  also  a  composite  of  faculties,  or  traits,  or  func- 
tions, all  of  which  are  more  or  less  disparate  and  separable. 
Accepting  teaching  in  the  latter  meaning,  we  get  genuine  insight 
into  the  troublesome  problems  of  estimating  teaching  ability  and 
in  rating  teachers,  if  we  list  the  constituents  of  teaching  ability, 
assign  a  value  to  each  ability,  measure  the  amount  in  which  each 
ability  exists  in  any  given  teacher,  and  thus  compute  a  final  rating 
for teachers. The various score cards which have been devised
for rating teachers attempt to solve, in various degrees of completeness,
just this sort of problem. The following data throw
light on what actually happens, when analysis by personal judgments
is attempted.

DATA    ON    ANALYSIS 

The  best  way  to  present  the  data  is  to  give  a  running  account 
of  how  they  were  obtained.  When  the  teachers  in  Towns  A,  B, 
and  C  rated  each  other  for  general  teaching  ability,  they  also 
rated  each  other  for  specific  qualities.  These  qualities  are  those 
which  could  well  be  considered  as  significant  and  analyzed 
qualities  of  teaching  ability. 

The  complete  instructions  which  were  given  to  the  teachers 
and  the  qualities  which  were  to  be  rated  will  be  seen  from  an 
inspection  of  the  original  instructions  which  are  here  reproduced. 

INSTRUCTIONS    GIVEN   TO    TEACHERS 

On  this  sheet  you  are  requested  to  give  certain  ratings  of  each  teacher  in 
the  list,  including  yourself.  Please  rate  every  teacher,  and  please  be  absolutely 
frank  in  your  ratings.  You  need  not  sign  your  name.  Nobody  will  ever  know 
how  you  or  anybody  else  rated  him.  No  personal  use  will  ever  be  made  of  any 
of  these  ratings.  They  will  be  used  in  a  purely  scientific  study  to  determine 
the  significance  of  age,  education,  early  interests,  etc.,  etc.,  for  success  as  a 
teacher.  The  names  will  all  be  cut  off  and  destroyed  as  soon  as  the  different 
items  in  the  inquiry  have  been  numbered  to  fit  the  ones  to  whom  they  refer. 
Also,  do  not  feel  disturbed  because  in  each  respect  somebody  has  to  be  rated 
lowest.  These  ratings  are  all  relative,  and  the  lowest  teacher  in  the  group  may 
well  be  of  very  great  ability.  Please  be  sure  to  record  ratings  even  if  they 
seem  to  you  to  be  little  better  than  mere  guesses.  The  opinions  of  twenty  men 
will  give  a  useful  rating,  even  if  any  one  of  the  twenty  taken  alone  is  almost 
worthless. 

On  the  sheet  is  a  list  of  the  teachers.  Choose  the  teacher  of  greatest  teaching 
ability  in  the  group  and  write  a  figure  1  after  his  or  her  name  in  column  1. 
Choose  the  teacher  next  below  in  teaching  ability  and  write  2  after  his  or  her 
name in column 1. Write 3 after the name of the one next in teaching ability
and so on with 4, 5, 6, etc. If two or more seem absolutely equal in teaching
ability give them the same rating.

Then think of the ability to understand and manage people, to get on with
other  men,  to  secure  obedience  from  inferiors,  cooperation  from  equals,  and 
consent  and  support  from  superiors  in  school,  business,  or  other  activities. 
Choose  the  teacher  of  greatest  ability  in  the  group  and  write  1  after  his  or  her 
name  in  column  2.     Proceed  as  for  teaching  ability  ranking. 

Then  think  of  intellectual  ability,  the  ability  to  manage  ideas,  to  work  with 
facts,  rules,  and  principles,  to  learn  the  science  of  a  thing,  to  understand 
explanations and reasons, to think things out. Choose the teacher of greatest
intellectual ability in the group and write 1 after his or her name in column 3.
Proceed  as  for  teaching  ability  rating. 

Then  think  of  the  ability  to  manage  things  and  mechanisms,  the  ability  to  sail 
a  boat,  to  drive  a  motor  car,  to  use  tools,  machines,  and  instruments  of  all  sorts, 
to  be  handy.  Choose  the  teacher  of  greatest  ability  to  manage  things  and 
mechanisms in the group and write 1 after his or her name in column 4. Proceed
as for teaching ability rating.

Then  think  of  general  scholarship,  signs  of  education,  knowledge  of  literature, 
etc.  Choose  the  teacher  of  greatest  general  scholarship  in  the  group  and  write 
1  after  his  or  her  name  in  column  5.     Proceed  as  for  teaching  ability  ranking. 

Then think of skill in government or discipline, ability to control, to keep
order, etc. Choose the teacher of greatest skill in government or discipline
in the group and write 1 after his or her name in column 6. Proceed as for
teaching ability rating.

Then  think  of  instructional  skill,  pure  ability  to  instruct,  correct  and  effective 
methods, economy of time and effort, ability to get all pupils to understand
the  subject-matter.  Choose  the  teacher  of  greatest  ability  in  instructional 
skill  in  the  group  and  write  1  after  his  or  her  name  in  column  7.  Proceed  as 
for  teaching  ability  rating. 

Then  think  of  initiative,  the  making  of  headway,  the  starting  of  new  means, 
the  stating  of  new  ends.  Choose  the  teacher  with  the  greatest  initiative  in  the 
group  and  write  1  after  his  or  her  name  in  column  8.  Proceed  as  for  teaching 
ability  rating. 

Then  think  of  nervous  and  physical  strength.  Choose  the  teacher  of  greatest 
nervous and physical strength in the group and write 1 after his or her name in
column  9.     Proceed  as  for  teaching  ability  rating. 

Then  think  of  that  teacher  who  commands  the  greatest  respect  of  the  pupils 
in the group and write 1 after his or her name in column 10. Proceed as for
teaching  ability  rating. 

Finally,  think  of  general  ability  to  get  results.  Choose  the  teacher  with  the 
greatest  ability  to  get  results  in  the  group  and  write  1  after  his  or  her  name 
in  column  11.     Proceed  as  for  teaching  ability  rating. 

The  actual  ratings  for  eleven  traits  made  on  a  prepared  sheet 
(see  illustration  on  next  page)  were  secured  for  the  156  teachers 
in  exactly  the  same  way  as  the  ratings  for  general  teaching 
ability  were  secured. 

The  ratings  for  general  intellectual  ability  and  for  skill  in  dis- 
cipline were  then  treated  as  were  the  estimates  of  general  teaching 
ability.  For  the  six  groups  of  teachers  a  relative  rating  for  the 
qualities,  general  intellectual  ability  and  skill  in  discipline,  were 
obtained.  These  were  turned  into  quantitative  ratings  as  in  the 
case  for  general  teaching  ability.  As  the  statistical  process  was 
the  same  as  that  which  was  described  in  Chapter  II,  under  the 
heading "Process of Rating Teachers," a description of the procedure
need not be here repeated.

Names of Teachers    1    2    3    4    5    6    7    8    9    10    11

EXPLANATION   OF   COLUMNS 

Col.    1.  General  ability  as  a  teacher. 

Col.    2.  General  ability  to  manage  people. 

Col.    3.  General  intellectual  ability. 

Col.    4.  Ability  to  manage  things  and  mechanism. 

Col.    5.  General  scholarship. 

Col.    6.  Skill  in  discipline. 

Col.    7.  Ability  to  instruct. 

Col.    8.  Initiative. 

Col.    9.  Nervous  and  physical  strength. 

Col.  10.  Respect  of  pupils. 

Col.  11.  General  ability  to  get  results. 

In  all  instances  half  of  the  ratings  were  selected  by  chance  and 
were  treated  as  the  data  to  form  one  rating,  and  the  other  half 
of  the  ratings  were  used  to  form  another  rating.  Thus  ratings  for 
those  two  qualities  for  the  six  groups  could  be  checked. 
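
The chance-halves check can be sketched as follows. The conversion of rankings into ratings is described in Chapter II and is not reproduced here, so this sketch assumes, purely for illustration, that each half is summarized by the mean rank given to each teacher. The function names, the seed, and the sample rankings are all hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denominator = sqrt(sum(d * d for d in dx)) * sqrt(sum(d * d for d in dy))
    if denominator == 0:
        raise ValueError("r is undefined when either series is constant")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


def split_half_agreement(rankings, seed=0):
    """Divide the judges into two chance halves and correlate the halves.

    `rankings` holds one list per judge, giving the rank that judge assigned
    to each teacher, with teachers in the same order for every judge.
    """
    if len(rankings) < 2:
        raise ValueError("need at least two judges")
    judges = list(rankings)
    random.Random(seed).shuffle(judges)      # selection of halves by chance
    half = len(judges) // 2
    group_a, group_b = judges[:half], judges[half:]

    def mean_ranks(group):
        return [sum(ranks) / len(group) for ranks in zip(*group)]

    return pearson_r(mean_ranks(group_a), mean_ranks(group_b))


# Hypothetical rankings of four teachers by four judges.
rankings = [[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4], [1, 2, 4, 3]]
agreement = split_half_agreement(rankings)
```

A value near +1 means the two halves of the judges ordered the teachers in much the same way, which is the sense of agreement reported in the tables of this chapter.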


AGREEMENT  BETWEEN  TWO  GROUPS  OF  JUDGES  WHO  JUDGE  THE 
SAME  TEACHERS  FOR  THE  QUALITY  GENERAL  INTELLECTUAL 
ABILITY 

The correlations between the halves of the judgments for the
qualities, general intellectual ability and skill in discipline, were
then computed. One half of the judgments we shall call Group
A and the other, Group B. The reader will recall that there was,
in the case of mutual judgments for general teaching ability, a
high correlation between the two chance halves of the judgments
in the case of all six groups. We have the same condition prevailing
here.

These correlations are of interest:

                                    r between Two Groups
                                    of Judges Who Judge the
                                    Same Teachers for General
                                    Intellectual Ability         No. of Cases

Town A   Grade teachers ........... +.861, ±.035                      53
         High-school teachers ..... +.967, ±.016                      15
Town B   Grade teachers ........... +.899, ±.031                      35
         High-school teachers ..... +.845, ±.079                      13
Town C   Grade teachers ........... +.958, ±.014                      30
         High-school teachers ..... +.326, ±.279                      10

Average (weighted for number in each group) +.879, ±.016

These  correlations  show  that  there  is  close  agreement  among 
the  teachers  as  to  the  distribution  of  general  intellectual  ability 
among  them.  In  this  case  when  two  groups  of  judges  estimate 
the  differences  of  intellectual  capacity  of  a  corps  of  teachers  the 
mutual agreement is on the average +.879 ±.016. This is a
weighted  average  of  estimates  of  six  different  corps  of  teachers. 

These  correlations  are  also  of  interest: 

AGREEMENT BETWEEN TWO SETS OF JUDGES IN RATING A GROUP OF TEACHERS
FOR THE TRAIT SKILL IN DISCIPLINE

                                    r between Two Groups
                                    of Judges Who Judge the
                                    Same Teachers for Skill
                                    in Discipline

Town A   Grade teachers ........... +.943, ±.015
         High-school teachers ..... +.896, ±.050
Town B   Grade teachers ........... +.757, ±.071
         High-school teachers ..... +.581, ±.180
Town C   Grade teachers ........... +.728, ±.085
         High-school teachers ..... +.917, ±.049

Average (weighted for size of group) +.838, ±.023

With skill in discipline as with general teaching ability and
general intellectual ability, we find that the size of the correlation
indicates substantial agreement among the judges, which is very
far from the matter of chance that guesses or haphazard opinions
would have produced.

This  agreement  between  chance  halves  of  judgments  for  the 
qualities, intellectual ability and skill in discipline, does not prove
that  intellectual  ability,  as  such,  or  skill  in  discipline,  as  such, 
were  the  traits  actually  rated,  although  an  easy  interpretation  of 
the  data  might  lead  one  to  think  so.  This  agreement  simply 
means  that  on  the  whole  the  judges  had  the  same  quality  or  trait 
in  mind  and  really  did  agree  as  to  the  distribution  of  amounts  of 
the  qualities  or  traits. 

What  agreement  there  would  have  been  concerning  other  traits 
we  do  not  know.  It  is  fair  to  assume,  however,  that  equally  high 
agreement  exists.  The  three  traits  which  we  treated  statistically 
show  uniformly  high  agreement  and  the  enormous  amount  of 
time  required  to  work  out  other  ratings  seems  unnecessary,  when 
the  first  three  treated  show  the  amount  of  agreement  that  is 
present. 

The  important  fact  is  that  when  teachers  rate  each  other  for 
general  teaching  ability,  or  for  a  specific  quality,  such  as  skill  in 
discipline,  chance  halves  of  the  ratings  mutually  correlate  so 
highly  that  substantial  agreement  is  fairly  established.  The 
average  correlation  between  chance  halves  of  judges  when  judging 
the  same  group  of  teachers  for  the  same  qualities  is  +.872. 
This  calculation  is  based  on  the  average  of  eighteen  sets 
of  judgments. 
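
The chance-halves procedure described above can be sketched in code. This is a minimal illustration and not the author's worksheet: the ratings are synthetic, and the names `pearson`, `split_half_agreement`, `true_merit`, and `ratings` are assumptions introduced here. Judges are divided at random into a Group A and a Group B, each group's ratings are averaged for every teacher, and the two sets of averages are correlated.

```python
import random
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_agreement(ratings, rng):
    """ratings[j][t] is judge j's rating of teacher t.

    Judges are shuffled into two chance halves; each half's mean rating
    per teacher is computed, and the two sets of means are correlated.
    """
    judges = list(range(len(ratings)))
    rng.shuffle(judges)
    half = len(judges) // 2
    group_a, group_b = judges[:half], judges[half:]
    n_teachers = len(ratings[0])
    mean_a = [sum(ratings[j][t] for j in group_a) / len(group_a)
              for t in range(n_teachers)]
    mean_b = [sum(ratings[j][t] for j in group_b) / len(group_b)
              for t in range(n_teachers)]
    return pearson(mean_a, mean_b)

# Synthetic data (hypothetical): 20 judges rate 30 teachers; every rating
# is the teacher's general standing plus the judge's own random error.
rng = random.Random(0)
true_merit = [rng.gauss(0, 1) for _ in range(30)]
ratings = [[m + rng.gauss(0, 1) for m in true_merit] for _ in range(20)]

r_ab = split_half_agreement(ratings, rng)
print(f"agreement between chance halves: {r_ab:+.3f}")
```

Because each half averages several judges, individual errors largely cancel, which is one reason agreement between halves can be high even when single judges are only moderately reliable.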

THE    ABSENCE    OF    ANALYSIS    IN    RATING 

To  find  out  how  much  actual  analysis  is  made  when  judgments 
for  specific  traits  are  recorded,  we  shall  correlate  the  ratings  for 
general  teaching  ability  with  general  intellectual  ability;  general 
teaching  ability  with  skill  in  discipline ;  general  intellectual  ability 
with  skill  in  discipline;  and  then  interpret  the  correlations  which 
have  thus  been  obtained. 

What  relation  is  there  between  ability  to  teach  and  intellectual 
ability  when  both  traits  are  judged  by  mutual  ratings?  As  we 
have  here  two  independent  measures  for  each  trait,  we  can  correct 
for  attenuation  and  get  a  reliable  finding.  The  independent 
measures are, of course, the two ratings of the two chance groups,
"A" and "B" mentioned before.

THE CORRELATIONS BETWEEN (I) GENERAL TEACHING ABILITY (II)
GENERAL INTELLECTUAL ABILITY WHEN THE SAME JUDGES
JUDGE THE SAME TEACHERS FOR TWO TRAITS FOLLOW

                                                                r between
                                                                Traits 1 and
                                                                2, Corrected
                                  r between      r between      for Attenua-
                                  I A and II B¹  II A and I B²  tion

Town A   Grade teachers           +.927 ±.019    +.802 ±.049    +.957 ±.011
         High-school teachers     +.925 ±.037    +.822 ±.083    +.937 ±.030
Town B   Grade teachers           +.899 ±.032    +.919 ±.026    +1.000+³
         High-school teachers     +.919 ±.043    +.791 ±.103    +.925 ±.041
Town C   Grade teachers           +.461 ±.143    +.859 ±.047    +.713 ±.089
         High-school teachers     +.944 ±.034    +.260 ±.294    +1.000+³

Average (weighted for size
of group)                         +.847 ±.018    +.819 ±.026    +.935 ±.014

¹ I A and II B is read the correlation between teaching ability as rated by one group of
judges and intellectual capacity as rated by another group of judges.

² II A and I B is read the same, except that the groups of judges estimate the traits in
reverse order.

³ The two correlations above +1.0 of course are wrong in the sense that we could not have
more than perfect correspondence.
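
The correction for attenuation in the third column can be illustrated with a short sketch. It assumes the usual Spearman form, in which the geometric mean of the two cross-correlations is divided by the geometric mean of the two reliabilities (the agreement between chance halves for each trait); the text does not print the formula, so this form is an assumption. The function name and the teaching-ability reliabilities of 0.94 and 0.90 are hypothetical values chosen for illustration. The remaining figures are taken from the Town A and Town B grade-teacher rows.

```python
from math import sqrt

def correct_for_attenuation(r_1a_2b, r_2a_1b, rel_1, rel_2):
    """Spearman-style correction (assumed form, not quoted from the text).

    r_1a_2b, r_2a_1b : cross-correlations between the two traits, each
                       taken across the two chance groups of judges
                       (both assumed positive)
    rel_1, rel_2     : agreement between chance halves for each trait
    """
    cross = sqrt(r_1a_2b * r_2a_1b)
    return cross / sqrt(rel_1 * rel_2)

# Town A grade teachers: cross-correlations +.927 and +.802, agreement for
# intellectual ability +.861. The 0.94 for teaching ability is hypothetical.
corrected = correct_for_attenuation(0.927, 0.802, rel_1=0.94, rel_2=0.861)

# When the cross-correlations are as large as the reliabilities, the
# quotient passes 1.0, which is how entries such as +1.000+ can arise.
# Town B grade teachers: +.899 and +.919, agreement +.899; 0.90 hypothetical.
over_one = correct_for_attenuation(0.899, 0.919, rel_1=0.90, rel_2=0.899)

print(f"corrected r: {corrected:.3f}")
print(f"value above unity: {over_one:.3f}")
```

A corrected value above 1.0 is an artifact of sampling error in the four observed correlations and does not indicate more than perfect correspondence.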

The  correlation  between  general  teaching  ability  and  general 
intellectual  ability,  when  weighted  for  size  of  groups,  is  +.935 
±  .014.  On  first  glance,  it  would  seem  as  if  there  were  an  astound- 
ingly  high  mutual  relationship  between  ability  to  teach  and  gen- 
eral intellectual  ability. 

The  correlations,  however,  should  be  given  more  than  passing 
notice.  First,  let  us  go  back  to  the  eleven  traits  on  the  original 
rating  sheets.  In  a  sense,  these  original  rating  sheets  might  be 
considered  score  cards  extended  from  one  person  judged  and  one 
person  rating  to  many  persons  judged  and  many  persons  scoring. 
We  might  further  think  that  general  teaching  ability  is  a  compos- 
ite of  the  other  ten  traits  which  have  been  mentioned.  At  least 
these  are  among  the  more  important  traits  mentioned  on  score 
cards  in  general.  Our  correlation  of  +.935 ±.014  between  gen- 
eral teaching  ability  and  general  intellectual  ability  could  be 
variously  interpreted.  It  is  exceedingly  important  that  the 
interpretation  should  be  correct. 

First,  we  might  conclude  that  the  judges  kept  general  teaching 
ability  and  general  intellectual  ability  clearly  distinct  from  each 
other  in  their  minds  when  they  were  rating  and  that  they  actually 
found  that  there  was  this  high  mutual  relationship  between 
intellect  and  pedagogic  skill.  From  this  reasoning  it  might  fairly 
be held that the stronger a teacher was the abler she would be
mentally, and, conversely, that mental vigor implies a corresponding
degree of teaching power.

Second, on the other hand, this correlation of +.935 ±.014
may  be  interpreted  as  an  inability  on  the  part  of  the  judges 
to  distinguish  effectively  between  teaching  strength  and  intellec- 
tual capacity  in  persons  judged  by  them.  In  other  words,  a  judge 
has  a  certain  opinion  of  a  teacher  in  toto,  and  his  opinion  is  given 
according  to  his  general  impression  in  answer  to  any  significant 
question  about  that  teacher.  Thus,  the  general  estimate  may  be 
taken  to  permeate  all  particular  judgments,  and,  conversely, 
particular  judgments  are  simply  defenses  for,  or  justifications  of, 
the  general  opinion  which  has  thus  been  held. 

To  make  this  still  clearer,  let  us  assume  that  a  person  likes  a 
certain  picture.  If  this  like  is  strong  enough,  it  will  not  vary 
from  whatever  point  of  view  the  picture  may  appear.  Let  it 
stand  on  the  right  of  the  person;  he  will  still  like  it.  Let  him  see 
the  picture  from  the  left;  he  will  still  like  it.  The  total  effect 
being  pleasing,  it  will  not  be  hard  so  to  rationalize  his  thinking 
that  the  background,  the  middle,  and  the  foreground  will  all 
appear  to  be  well  painted.  The  detail  will  be  correct  or  over- 
looked, and  the  main  features  will  be  good  or  easily  condoned. 
We  can  very  well  term  this  process  the  spreading  of  a  halo  of 
general  effect  to  all  particular  parts. 

So  it  might  well  be  in  judging  a  teacher.  Looked  at  from  the 
right  or  the  left,  from  the  aspect  of  intellect  or  from  that  of  gen- 
eral ability  to  teach,  the  general  opinion  will  still  be  present  and 
will  be  the  basis  upon  which  the  judgment  is  formed.  This  is 
apparently  the  most  reasonable  interpretation  of  the  correlation. 
In  many  of  our  school  practices  we  have  assumed  for  ourselves 
the  ability  to  analyze  an  organic  whole  and  an  ability  to  judge  the 
parts  of  a  person,  irrespective  of  the  whole;  but,  when  we  actually 
check  up  our  mental  processes,  we  see  that  this  ability,  if  it  exists 
at  all,  exists  in  a  very  small  degree. 

It appears that this spread of the general estimate enters into
our particular judgments to a degree little expected before. For
it is to be doubted whether anyone would seriously hold that a
correlation of +.935 ±.014 really exists between general teaching
ability and general intellectual ability.

The reader will remember that in about 100 cases we determined
intellectual differences by means of standardized tests.
The correlation between general teaching ability and intellect, as
measured by tests, was extremely low (+.164 was the average).
Either  the  tests  are  not  measures  of  intellect  at  all  and  hence  the 
correlation  +.164  is  false,  or  the  judgments  of  intellect  include  so 
many  other  qualities  that  they  really  are  not  judgments  of  intel- 
lect at  all  and  the  +.935  correlation  is  false. 

It  should  be  remembered  that  teachers  are  already  a  highly 
selected  group.  There  could  hardly  be  any  correlation  of  +.935 
between  any  two  traits  which  were  not  practically  identical. 
Since  we  know  that  the  tests  which  were  used  were  more  than 
indifferent  tests  of  what  goes  by  the  name  of  intellect,  we  are 
fairly  correct  in  our  conclusion  that  the  correlation  of  +.935 
between general teaching ability and general intellectual ability
as  estimated  by  judgments  shows  not  an  estimate  of  intellect  to 
have  been  made,  but  rather  an  estimate  of  general  ability  under 
the  name  of  intellect.  The  analysis  in  these  instances  simply  was 
not  made! 


THE CORRELATION BETWEEN ABILITY TO TEACH AND SKILL IN
DISCIPLINE

We  have  two  separate  measures  of  general  teaching  ability  and 
two  separate  measures  of  skill  in  discipline.  Correlation  between 
the  "A"  group  of  judges'  estimates  for  general  teaching  ability 
and the "B" group of judges' estimates for skill in discipline are
recorded under the caption I A and VI B. The correlations under
the  caption  I  B  and  VI  A  are  the  correlations  between  the  "B" 
group  of  judges'  estimates  of  general  teaching  ability  and  the 
"A"  group  of  judges'  estimates  for  skill  in  discipline.  The  third 
column  gives  the  correlations  which  have  been  corrected  for 
attenuation. 

                                                                r between
                                                                Trait 1 and
                                                                Trait 6, Cor-
                                                                rected for
                                  I A and VI B   I B and VI A   Attenuation

Town A   Grade teachers           +.776 ±.055    +.712 ±.068    +.787 ±.052
         High-school teachers     +.650 ±.149    +.829 ±.080    +.789 ±.094
Town B   Grade teachers           +.686 ±.089    +.580 ±.112    +.699 ±.085
         High-school teachers     +.703 ±.140    +.767 ±.081    +1.000+¹
Town C   Grade teachers           +.679 ±.098    +.696 ±.094    +.824 ±.058
         High-school teachers     +.900 ±.060    +.625 ±.192    +.964 ±.022

Average (weighted for size
of group)                         +.741 ±.036    +.703 ±.040    +.789 ±.001

¹ The two correlations above +1.0 of course are wrong in the sense that we could not have
more than perfect correspondence.

The correlation between general teaching ability and skill in
discipline,  when  weighted  for  size  of  groups,  is  +.789 ±.001. 
Here  again  we  find  a  higher  correlation  than  we  would  ordinarily 
expect.  It  is  not  higher  than  that  between  general  teaching 
ability  and  general  intellectual  ability,  although  we  would  cer- 
tainly hold  that  it  should  be.  This  is  accounted  for  by  the  fact 
that  disciplinary  skill  can  be  better  judged  than  intellect,  and, 
therefore,  the  tendency  to  spread  a  judgment  might  be  lessened; 
but  there  is  more  of  the  explanation  in  the  fact  that  discipline  was 
the  sixth  trait  that  was  rated.  By  the  time  that  the  sixth  column 
is  reached  there  is  a  fairly  definite  temptation  to  vary  ratings  as 
a  matter  of  principle  or  as  a  device  to  relieve  monotony,  or  simply 
because  one  wants  to. 

It  is  fair  to  assume  that,  if  discipline  had  been  the  second  rather 
than  the  sixth  trait  to  be  rated,  the  correlation  would  have  been 
higher.  By  how  much  is,  of  course,  uncertain.  As  far  as  tradi- 
tion goes  the  correlation  should  have  been  higher  between  general 
teaching  ability  and  skill  in  discipline  than  between  general  teach- 
ing merit  and  intellectual  strength.  For  it  is  everywhere  assumed 
that  in  public-school  teaching,  skill  in  discipline  is  the  first  requi- 
site. The  fact  that  153  teachers  in  groups  rating  each  other  found 
a  higher  mutual  relationship  between  general  teaching  ability  and 
general  intellectual  ability  than  between  general  teaching  ability 
and  skill  in  discipline  is,  to  say  the  least,  interesting.  It  is  also 
hard  to  account  for,  except  by  the  fact  that  judgments  of  particu- 
lar traits  are  really  defenses  of  general  estimate  rather  than  esti- 
mates of  particular  traits  which  have  been  considered  in  isolation. 

Of  course,  governing  skill  is  a  constituent  of  good  teaching, 
but  that  the  true  correlation  is  as  high  as  +.787  is  to  be  much 
doubted.  If  it  were  true,  it  should  mean  that  the  drill  sergeant 
would  be  the  best  teacher.  It  would  also  imply  that  mere  order- 
keeping  was  a  larger  part  of  instruction  than  we  believe  it  to  be. 
The  factor  of  spread  of  general  opinion  is  also  present  here. 

The correlation between general intellectual ability and skill in
discipline, when weighted for size of groups, is +.719 and when
corrected for attenuation is +.863 ±.020.

The correlation between Trait II and Trait VI, when corrected
for attenuation, gives the final correlation between general
intellectual ability and skill in discipline. This correlation instead of
revealing the fact of the case is, if taken at its face value, nothing
short of preposterous. Were this really the truth, what a prodigy
of intellect the "strict," but often dull, teacher would be! If we
thus generalized, we would also hold that Grant, admittedly a past
master in control, also towered above Lincoln in mental stature.

THE CORRELATION BETWEEN GENERAL INTELLECTUAL ABILITY AND SKILL
IN DISCIPLINE

                                                                  r between
                                                                  Trait II and
                                                                  Trait VI, Cor-
                                                                  rected for At-
                                  II A and VI B¹   II B and VI A²  tenuation

Town A   Grade teachers           +.700 ±.070      +.800 ±.049     +.941
         High-school teachers     +.525 ±.187      +.805 ±.090     +.698
Town B   Grade teachers           +.756 ±.072      +.609 ±.106     +.824
         High-school teachers     +.583 ±.183      +.789 ±.104     +.968
Town C   Grade teachers           +.915 ±.029      +.663 ±.102     +.932
         High-school teachers     +.024 ±.316      +.750 ±.297     +.245

Average (weighted for size
of group)                         +.697 ±.042      +.741 ±.036     +.863

¹ II A and VI B is read: the correlation between Group A judges' estimate of general
intellectual ability with Group B judgments of skill in discipline, when both groups of judges are
judging the same teacher.

² II B and VI A is similarly interpreted.


THE    INFLUENCE    OF    GENERAL    ESTIMATE 

The  factor  of  spread  of  general  opinion  to  particular  traits  is 
here  well  illustrated.  We  must  remember  that  these  teachers 
are  a  relatively  selected  group  for  general  intellectual  ability. 
All  are  graduates  of  normal  school  or  college,  and  this  would  tend 
to  lower  the  correlation.  Of  course,  there  is  some  correlation 
between  general  intellectual  ability  and  skill  in  discipline.  A 
stark  fool  could  not  control  a  class,  but  common  sense  would 
prohibit  us  from  believing  that  any  such  mutual  relationship  as  a 
correlation  of  +.863  suggests  is  the  actual  fact. 

We  would  also  deny,  irrespective  of  these  or  any  other  data 
likely  to  be  presented,  that  there  was  no  closer  relationship 
between  teaching  ability  and  discipline  than  between  intellect 
and discipline. And yet our findings, if interpreted literally,
show  this. 

This  factor  of  spread  of  general  estimate  can  be  illustrated 
in  another  way.  Allow,  for  the  purpose  of  the  illustration,  that 
the  supervisors'  estimates  of  general  teaching  merit  adequately 
represent  the  facts.  If  we  correlate  what  the  teachers  rated  as 
intellect  with  what  the  supervisors  rated  as  general  ability,  we 
get valuable evidence that teachers rate teaching ability even
when they are asked to rate general intellectual ability. The
same thing can be done for teachers' estimates for skill in
discipline and supervisors' ratings for general teaching ability. These
correlations are as follows:

                                  r between Super-         r between Super-
                                  visors' Estimate of      visors' Estimate
                                  General Teaching         of General Teach-
                                  Ability and General      ing Ability and
                                  Intellectual Ability     General Intellec-
                                  as Estimated by          tual Ability as Es-
                                  Group A Teachers         timated by Group
                                                           B Teachers

Town A   Grade teachers           +.883 (52 cases)         +.869
         High-school teachers     +.840 (15 cases)         +.885
Town B   Grade teachers           +.945 (35 cases)         +.971
         High-school teachers     +.611 (13 cases)         +.750
Town C   Grade teachers           +.768 (30 cases)         +.741
         High-school teachers     +.477 (10 cases)         +.999

                                  r between Super-         r between Super-
                                  visors' Estimate of      visors' Estimate
                                  General Teaching         of General Teach-
                                  Ability and Skill in     ing Ability and
                                  Discipline as Esti-      Skill in Disci-
                                  mated by Group A         pline as Esti-
                                  Teachers                 mated by Group
                                                           B Teachers

Town A   Grade teachers           +.786                    +.580
         High-school teachers     +.450                    +.708
Town B   Grade teachers           +.679                    +.729
         High-school teachers     +.562                    +.847
Town C   Grade teachers           +.759                    +.956
         High-school teachers     +.890                    +.772

The  average  correlation,  weighting  for  size  of  group  judgments, 
between  supervisors'  estimates  of  general  teaching  ability  and 
mutual  judgments  of  the  teachers  for  general  intellectual  ability 
is  +.876;  between  supervisors'  estimates  of  general  teaching 
ability and mutual judgments of the teachers for skill in discipline,
+.744. We could not hold that any such relation really
held  between  general  teaching  ability  and  either  general  intellec- 
tual ability  or  skill  in  discipline.  These  correlations  are  another 
and  sufficient  evidence  of  the  fact  that  in  analyzed  judgments  the 
factor  of  the  spread  of  the  general  estimate  is  present  in  a  most 
vicious form.

The factor of spread is shown by these data:

CORRELATIONS, CORRECTED FOR ATTENUATION, BETWEEN GENERAL TEACHING
ABILITY AND GENERAL INTELLECTUAL ABILITY, WHEN BOTH ARE JUDGED
BY GROUPS OF TEACHERS

Town A   Grade teachers                          +.957
         High-school teachers                    +.937
Town B   Grade teachers                          +1.000
         High-school teachers                    +.925
Town C   Grade teachers                          +.713
         High-school teachers                    +1.000

Average (weighted for size of groups)            +.935 ±.014

CORRELATIONS BETWEEN GENERAL TEACHING ABILITY AND SKILL IN DISCIPLINE
WHEN BOTH ARE JUDGED BY GROUPS OF TEACHERS

Town A   Grade teachers                          +.787
         High-school teachers                    +.789
Town B   Grade teachers                          +.698
         High-school teachers                    +1.000
Town C   Grade teachers                          +.824
         High-school teachers                    +.964

Average (weighted for size of groups)            +.789 ±.041

CORRELATIONS BETWEEN SKILL IN DISCIPLINE AND GENERAL INTELLECTUAL
ABILITY, WHEN BOTH ARE JUDGED BY GROUPS OF TEACHERS

Town A   Grade teachers                          +.941
         High-school teachers                    +.698
Town B   Grade teachers                          +.824
         High-school teachers                    +.968
Town C   Grade teachers                          +.932
         High-school teachers                    +.245

Average (weighted for size of groups)            +.863 ±.025

CORRELATION BETWEEN ABILITY TO TEACH, AS JUDGED BY SUPERVISORS, AND
GENERAL INTELLECTUAL ABILITY, AS JUDGED BY GROUPS OF TEACHERS

Average                                          +.876

CORRELATION BETWEEN ABILITY TO TEACH, AS JUDGED BY SUPERVISORS, AND
SKILL IN DISCIPLINE, AS JUDGED BY GROUPS OF TEACHERS

Average                                          +.744

It  would  seem  that,  when  estimates  are  made  of  specific  traits 
and  such  high  correlations  are  obtained  between  the  traits,  a 
damaging  factor  of  spread  of  general  estimate  must  be  allowed 
as a fact.

Some conclusions follow:

First, that teachers, when rating each other for specific
qualities, such as intellect or skill in discipline, agree in their estimates.
This is shown by correlating the ratings of the same teachers for
the same trait. These correlations average +.858.

Second,  that  when  ratings  are  made  for  specific  qualities,  a 
correlation  between  these  ratings  and  those  for  general  teaching 
ability  is  so  high  that  a  very  great  spread  of  the  general  estimate 
is  present  in  the  judgments  for  particular  qualities. 

Third,  this  factor  of  the  halo  of  general  estimate,  being  present 
in  particular  judgments,  is  further  shown  by  correlating  the 
ratings  for  general  teaching  ability  as  they  are  given  by  the  super- 
visors with  ratings  for  particular  qualities  obtained  by  group 
judgments.     Some  average  correlations  are  given  below: 

General teaching ability, as obtained by group judgments,
with general intellectual ability, similarly obtained          +.935 ±.014

General teaching ability, as obtained by group judgments,
with skill in discipline, similarly obtained                   +.789 ±.001

General intellectual ability, as obtained by group
judgments, and skill in discipline, similarly obtained         +.863 ±.020

General teaching ability, as obtained by supervisors'
estimates, with general intellectual ability, as obtained by
group judgments                                                +.876 ±.020

General teaching ability, as obtained by supervisors'
estimates, with skill in discipline                            +.744 ±.091

Fourth, when analysis is attempted, analysis is not obtained,
but  ratings  are  obtained  and  these  ratings  are  vitally  influenced 
by  the  general  estimate. 

It  might  be  urged  that  this  factor  of  spread  of  general  estimate 
was  greatly  stimulated  by  the  method  of  scoring,  by  the  nature 
of  instructions  which  were  given,  or  by  other  reasons. 

FAILURE    OF    ATTEMPTS    AT    ANALYSIS 

To  check  up  the  factor  of  spread  in  other  circumstances,  the 
analyzed  ratings  have  been  obtained  of  129  teachers  in  a  New 
York  school  system.  Here  a  regular  Boyce  score  card  was  used 
and  the  teachers  were  rated  by  their  superintendent,  by  their 
respective  principals,  and  by  their  supervisors.  Each  teacher 
was  rated  for  forty-five  distinct  qualities.  A  list  of  these  quali- 
ties is  found  on  page  64. 

Some of the usual errors in rating were present. One error
that is worth mentioning is that, although the instructions
specifically pointed out that good means above the average, the
distributions of ratings were in part skewed somewhat sharply from a
normal distribution.

Within  any  group  of  sufficient  size  it  may  be  assumed  for  statis- 
tical purposes  that  the  following  distribution  will  satisfactorily 
represent  the  facts:  10  per  cent  very  poor;  20  per  cent  poor; 
40  per  cent  medium  or  average;  20  per  cent  good;  and  10  per 
cent  excellent. 

The distributions of the ratings, which were given, follow:

General Teaching Ability:

                 No.    Per Cent    Per Cent
Very poor          0       0.0      normally should be 10
Poor               5       3.9      normally should be 20
Medium            34      27.8      normally should be 40
Good              71      55.0      normally should be 20
Excellent         19      14.7      normally should be 10

Skill in Discipline:

                 No.    Per Cent    Per Cent
Very poor          1       0.7      normally should be 10
Poor               4       3.0      normally should be 20
Medium            27      20.9      normally should be 40
Good              61      47.2      normally should be 20
Excellent         36      27.9      normally should be 10

General Intellectual Ability:

                 No.    Per Cent    Per Cent
Very poor          0       0.0      normally should be 10
Poor               0       0.0      normally should be 20
Medium            32      24.8      normally should be 40
Good              80      62.0      normally should be 20
Excellent         17      13.1      normally should be 10

The  distributions  for  the  ratings  in  voice  are  similarly  massed 
and  are  also  high.  The  distributions  for  the  other  traits,  for 
which  ratings  were  made,  have  not  been  worked  out.  The  extent 
of  the  mismarking  can  be  seen  in  the  case  of  general  intellectual 
ability. 
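
The comparison of given ratings with the assumed 10-20-40-20-10 distribution is simple arithmetic, and a short sketch may make it concrete. The counts are those reported for general intellectual ability; the function name `compare_to_expected` and the layout of the printed output are assumptions made for this illustration.

```python
# Assumed "normal" distribution of ratings, in per cent, as stated in the text.
EXPECTED_PCT = {"very poor": 10, "poor": 20, "medium": 40,
                "good": 20, "excellent": 10}

# Counts reported for general intellectual ability (129 teachers).
intellect_counts = {"very poor": 0, "poor": 0, "medium": 32,
                    "good": 80, "excellent": 17}

def compare_to_expected(counts, expected_pct):
    """Return (label, count, observed per cent, observed minus expected)."""
    total = sum(counts.values())
    rows = []
    for label, pct in expected_pct.items():
        observed_pct = 100 * counts[label] / total
        rows.append((label, counts[label], observed_pct, observed_pct - pct))
    return rows

rows = compare_to_expected(intellect_counts, EXPECTED_PCT)
for label, n, observed_pct, excess in rows:
    print(f"{label:<10} {n:>3}  {observed_pct:5.1f}%  {excess:+6.1f}")
```

A positive figure in the last column marks a grade that was awarded more often than the assumed distribution allows, and a negative figure one that was awarded too seldom.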

Assuming the least possible error and assuming the approximate
truth that a normal distribution of mental strength will be
present in a group as large as that which has been here
considered, both of which assumptions may fairly be made, we found
that only 17 per cent of the teachers received a proper rating.
We also found 16 per cent of the teachers were rated two steps
too high. The remainder were misrated by one step.

All  were  "rated  up."  The  same  fault  thus  appeared  in  score- 
card  rating  as  in  general-estimate  rating.  This  skewing  of  the 
distributions  is  of  importance  to  us,  not  because  it  shows  that 
actual  ratings  hardly  correspond  with  the  probable  facts,  but 
because  it  reduces  the  number  of  groups  or  the  spread  of  the  dis- 
tribution. When  correlations  between  traits  are  computed  from 
data  so  greatly  restricted  in  range,  the  correlations  are  lowered 
considerably  not  because  a  low  correlation  is  the  ultimate  fact, 
but  because  the  lack  of  spread  in  the  distribution  reduces  the  cor- 
relation mathematically.  This  rather  technical  consideration 
need  not  unduly  concern  us,  for  the  factor  of  spread  of  judgment 
may  be  shown  in  quite  another  way,  through  correlations  between 
qualities  so  large  that  only  undue  spread  of  judgment  can  ac- 
count for  them. 
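
The remark that a restricted range lowers a correlation mathematically can be checked with a small simulation. The sketch below is an illustration on synthetic data, not a reanalysis of the score-card ratings: it draws pairs of scores with a known underlying correlation (`rho = 0.8`, a hypothetical value), then recomputes the correlation after keeping only the cases whose first score lies above a cut-off, as happens when nearly everyone is marked "good" or better.

```python
import random
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

rng = random.Random(1)
rho = 0.8  # hypothetical correlation in the unrestricted group

# Generate 5000 pairs whose population correlation is rho.
pairs = []
for _ in range(5000):
    x = rng.gauss(0, 1)
    y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    pairs.append((x, y))

r_full = pearson([p[0] for p in pairs], [p[1] for p in pairs])

# Keep only the upper part of the range on the first trait.
kept = [p for p in pairs if p[0] > 0.5]
r_restricted = pearson([p[0] for p in kept], [p[1] for p in kept])

print(f"full range:       r = {r_full:+.3f}")
print(f"restricted range: r = {r_restricted:+.3f}")
```

The relationship between the two scores is the same in both computations; only the spread of the retained cases differs, and the coefficient falls with it.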

Eight correlations of traits follow:

General teaching ability with general intellectual ability     +.677 ±.03
General teaching ability with skill in discipline              +.787 ±.02
General teaching ability with voice                            +.632 ±.04
General intellectual ability with voice                        +.625 ±.04
General intellectual ability with skill in discipline          +.560 ±.04
Voice with interest in community                               +.500 ±.04
Voice with skill in discipline                                 +.438 ±.06
Skill in discipline with morals                                +.333 ±.11

These  ratings  were  made,  of  course,  entirely  independent  of 
this  study  and  under  circumstances  calling  for  unusual  care  and 
thoroughness. 

Common sense would tell us that the correlation between
voice (defined on the score card as "voice: pitch, quality,
clearness of school-room voice") and interest in community is probably
zero, but here it was found to be +.500, while voice and discipline
was +.438, and general intellectual capacity and voice was +.625.
The sizes of the correlations do not correspond to the importance
of the relationships.

These  data  are  worthy  of  more  extended  treatment  than  the 
correlations  on  page  61  would  indicate.  The  inter-correlations 
(120  in  number)  have  been  computed  for  the  following  traits: 
general  appearance,  health,  voice,  intellectual  capacity,  accuracy, 
self-control, sense of justice, academic preparation, interest in
the life of community, ability to meet and interest patrons,
professional interest and growth, use of English, discipline
(governing skill), attention to individual needs, and general development
of pupils.

The Correlations between 15 Traits as Found in the Score-Card
Rating of 129 Teachers of a City in New York State

Trait                                              30      45

 1. General Appearance                            .513    .482
 2. Health                                        .491    .427
 3. Voice                                         .438    .172
 4. Intellectual Capacity                         .560    .173
 7. Accuracy                                      .461    .218
11. Self-Control                                  .503    .467
14. Sense of Justice                              .647    .529
15. Academic Preparation                          .419    .423
20. Interest in the Life of the Community         .362    .376
21. Ability to Meet and Interest Patrons          .584    .499
24. Professional Interest and Growth              .621    .589
26. Use of English                                .572    .427
30. Discipline                                            .333
40. Attention to Individual Need                          .621
43. General Development of Pupils                         .290

[Only the columns headed 30 and 45 are legible in this copy. The entries
under the remaining column heads (20, 21, 24, 26, 40, 43), including a
stray value of .727, cannot be placed.]

The  most  obvious  fact  about  these  correlations  is  monotonous 
similarity.  They  do  not  vary  with  the  relevance  of  the  relation- 
ships. Most  of  them  are  too  high.  This  fact  illustrates  again 
the  factor  of  spread  of  general  estimate,  which  can  be  shown  in 
still  a  different  fashion. 

The distribution of the correlations follows:

Frequency      Correlation Range

    4          +.1 to +.2
    2          +.2 to +.3
   15          +.3 to +.4
   38          +.4 to +.5
   30          +.5 to +.6
   19          +.6 to +.7
    8          +.7 to +.8
    3          +.8 to +.9

True average   +.5

1. Suppose there were present in these correlations 100 per cent
spread of general estimate, and that the correlation which was
typical of the halo effect was +.5; we should not expect all the
correlations to be exactly +.5. They would vary or be grouped
around +.5 as a mean in a normal probability curve. That is,
they would tend to be at +.5, but some would be a little above
and some a little below. The probable error of the +.5
correlation, the number of individuals being 126, is ±.068 by the
formula

$$\text{S.D.} = \frac{1 - r^{2}}{\sqrt{n}}.$$

Using ±.068 as the S.D. of the probability
curve, we can plot the position of the 120 correlations as they
would occur, if pure chance only were operating.

If we place the distribution of the correlations as we have them
upon a distribution as they would occur by pure chance, then we
can see very clearly what part of the total number of correlations
needs some explanation other than pure-chance variation from a
typical one. The following chart shows this comparison:


COMPARISON OF NORMAL CURVE AND DISTRIBUTION OF CORRELATIONS. THE
SOLID LINE IS THE NORMAL CURVE, THE BROKEN LINE THE DISTRIBUTION
OF CORRELATIONS. CENTRAL TENDENCY IN BOTH CASES +.5.


2. Of the 120 correlations, 15 lie beyond the limits of pure-chance
variation from a mean and 105 lie within the limits of
chance variation. Within these limits the positions of the
correlations are not far dissimilar from what they would be, if chance
only were operating. For 105 of the correlations no other facts
than 100 per cent spread of general estimate and chance variation
are needed.
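
The comparison described in points 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the S.D. of a correlation is taken as (1 − r²)/√n, which for r = +.5 and n = 126 gives roughly the ±.068 used above; the observed frequencies are those printed in the distribution table; and the function names are hypothetical, not part of the original study.

```python
import math

def sd_of_r(r, n):
    """S.D. of a correlation coefficient, assumed here to be (1 - r^2) / sqrt(n)."""
    return (1.0 - r * r) / math.sqrt(n)

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal curve with the given mean and S.D."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def expected_counts(edges, mean, sd, total):
    """Expected number of coefficients in each interval if chance alone operated."""
    return [total * (normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd))
            for lo, hi in zip(edges[:-1], edges[1:])]

edges = [i / 10 for i in range(1, 10)]       # +.1, +.2, ..., +.9
observed = [4, 2, 15, 38, 30, 19, 8, 3]      # frequencies as printed above
chance = expected_counts(edges, 0.5, 0.068, 120)

for lo, obs, exp in zip(edges, observed, chance):
    print(f"+{lo:.1f} to +{lo + 0.1:.1f}: observed {obs:2d}, chance {exp:5.1f}")
```

Intervals in which the observed frequency clearly exceeds the chance frequency mark the correlations that call for some explanation other than chance variation.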
The reader must determine for himself how significant the other
15 are. The 15 correlations which are not explainable by
chance follow:

Voice and Moral Influence                                         +.172
Intellectual Capacity and Moral Influence                         +.173
Intellectual Capacity and Academic Preparation                    +.126
Academic Preparation and Ability to Meet and Interest Patrons     +.158
Accuracy and Moral Influence                                      +.218
General Appearance and Health                                     +.727
Accuracy and Attention to Individual Needs                        +.725
Self-Control and Sense of Justice                                 +.872
Sense of Justice and Ability to Meet and Interest Patrons         +.766
Sense of Justice and Use of English                               +.771
Sense of Justice and Attention to Individual Needs                +.822
Professional Interest and Growth and Use of English               +.748
Use of English and General Development of Pupils                  +.720
Discipline and General Development of Pupils                      +.787
Attention to Individual Needs and General Development of Pupils   +.807

It will be seen that 5 of these correlations are too low and 10 are
too high for explanation on a basis of mathematical chance.
Take the correlation of voice with moral influence of +.172.
Why should voice correlate with moral influence so loosely and
correlate +.682 with intellect, +.628 with accuracy, +.454 with
academic preparation, and +.500 with interest in the life of the
community? Why should voice correlate with accuracy as highly as it
does with skill in discipline? In fact, it is not clear why the 5
especially low correlations are the ones which they happen to be,
instead of being a part of any other 50 that one might pick almost at
random. The correlations that are so high that they are beyond the
range of chance variation from the mean are also a little difficult to
explain from any necessary relationships. Three of them have
general development of the pupils as one factor and use of English,
discipline, and attention to individual needs as the others.
Very likely these show true relationships, but the same data show
equally high mutual relationships between sense of justice and
use of English; sense of justice and ability to meet people; and
self-control and sense of justice. Of the 120 correlations 105, or 87
per cent, could be explained by mere chance variation, if the
statement "there is as much correlation between any two traits as
between any other two" were literally true. The remaining 15
coefficients of correlation perhaps can best be accounted for by a
mixture of true insight and more or less of the usual amount of
tendency to spread one's general estimate over particular judgments.
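
What 100 per cent spread of general estimate would look like can be imitated with artificial ratings. The sketch below is a hypothetical simulation, not a reanalysis of the score-card data: every trait rating is assumed to be the judge's general estimate of the teacher plus an independent error of equal variance, which makes the expected correlation between any two traits +.5. The names `pearson` and `halo_correlations` and the choice of random seed are illustrative assumptions.

```python
import math
import random

def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def halo_correlations(n_teachers, n_traits, seed=0):
    """Pairwise trait correlations when each rating = general estimate + error."""
    rng = random.Random(seed)
    general = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    # Equal variance for estimate and error gives an expected r of +.5.
    traits = [[g + rng.gauss(0.0, 1.0) for g in general]
              for _ in range(n_traits)]
    return [pearson(traits[i], traits[j])
            for i in range(n_traits) for j in range(i + 1, n_traits)]

# 16 traits rated on 126 teachers yield 120 correlations, as in the text.
rs = halo_correlations(n_teachers=126, n_traits=16)
print(len(rs), round(min(rs), 3), round(sum(rs) / len(rs), 3), round(max(rs), 3))
```

Although no trait in these artificial ratings bears any special relation to any other, the coefficients cluster about +.5 and differ from it only by chance, which is the monotonous similarity discussed above.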

DATA ON BOYCE'S SCORE CARD

The score card upon which the ratings used in the last section
were made is the work of A. C. Boyce. His study is reported in
the Fourteenth Year-Book of the National Society for the Study of
Education, Part IV. It is reviewed in the introduction of the
present study. Boyce's study is perhaps the most extended and the
best one on the rating of teachers. Boyce's original data, as
reported, contain some very good evidences of the factor of spread of
general estimate. As this point is not stressed in the Boyce report, it
might be well to close this discussion with a consideration of that
report.

Boyce found the following correlations between general teaching
ability and forty-five traits:

General Teaching Ability with                        r    Rank

 1. General Appearance                             +.47    43
 2. Health                                         +.56    39
 3. Voice                                          +.53    42
 4. Intellectual Capacity                          +.62    34
 5. Initiative and Self-reliance                   +.77    13
 6. Adaptability and Resourcefulness               +.80    11
 7. Accuracy                                       +.74    17
 8. Industry                                       +.69    24
 9. Enthusiasm and Optimism                        +.71    22
10. Integrity and Sincerity                        +.63    33
11. Self-control                                   +.66    30
12. Promptness                                     +.66    29
13. Tact                                           +.69    25
14. Sense of Justice                               +.61    36
15. Academic Preparation                           +.41    44
16. Professional Preparation                       +.38    45
17. Grasp of Subject-matter                        +.72    19
18. Understanding of Children                      +.76    15
19. Interest in the Life of the School             +.65    31
20. Interest in the Life of the Community          +.62    35
21. Ability to Meet and Interest Patrons           +.61    38
22. Interest in Lives of Pupils                    +.69    26
23. Cooperation and Loyalty                        +.66    28
24. Professional Interest and Growth               +.72    18
25. Daily Preparation                              +.68    27
26. Use of English                                 +.55    40
27. Care of Light, Heat, and Ventilation           +.61    37
28. Neatness of Room                               +.54    41
29. Care of Routine                                +.64    32
30. Discipline (Governing Skill)                   +.79    12
31. Definiteness and Clearness of Aim              +.81    10
32. Skill in Habit Formation                       +.86     5
33. Skill in Stimulating Thought                   +.84     8
34. Skill in Teaching How to Study                 +.84     7
35. Skill in Questioning                           +.72    20
36. Choice of Subject-matter                       +.85     6
37. Organization of Subject-matter                 +.87     3
38. Skill and Care in Assignment                   +.82     9
39. Skill in Motivating Work                       +.74    16
40. Attention to Individual Needs                  +.76    14
41. Attention and Response of the Class            +.86     4
42. Growth of Pupils in Subject-matter             +.87     2
43. General Development of Pupils                  +.88     1
44. Stimulation of Community                       +.70    23
45. Moral Influence                                +.71    21

These correlations cannot be taken at their face value for two
reasons: (a) They have not been corrected for attenuation and,
hence, are far too low. (b) The procedure by which they were
obtained injects an error which would make them too high.

These correlations are based upon data collected from 39
schools. That is, in 39 schools some judge rated the teachers for
the traits which have been mentioned. All the ratings were then
put on one correlation table and the mutual relationships were
worked out. Had the correlation for each of the 39 original
ratings been worked out and the mean and variability of the
distribution of these correlations been given, we should know what we
had.

When, however, ratings from 39 sets of teachers and from as
many different judges are combined before they are correlated,
we do not know just what we have. At best, the resulting
correlations form a composite of the correlations between the respective
pairs of traits, plus an erroneous pooling of sets of data. The
result is by no means a simple correlation between the traits, such
as Boyce took for granted. Boyce gave as statistical reference
Thorndike's Mental and Social Measurements, page 172 et seq.
Neither Thorndike nor any other statistician would justify
statistical liberties of the kind that Boyce took. It is also impossible
to compute the reliability of his coefficients of correlation.
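
One way in which pooling can raise a correlation may be shown with made-up figures. The sketch below is an assumption-laden illustration, not Boyce's data: it supposes three judges who differ only in the general level of the marks they give, and two traits that are wholly unrelated within each school. The ratings, the judge levels, and the function name `pearson` are all hypothetical.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings: within every school the two traits are unrelated.
OFFSETS_A = [-1, -1, 1, 1]
OFFSETS_B = [-1, 1, -1, 1]
JUDGE_LEVELS = [0, 3, 6]   # assumed general level of marks of three judges

schools = [([lvl + a for a in OFFSETS_A], [lvl + b for b in OFFSETS_B])
           for lvl in JUDGE_LEVELS]

# Correlation computed separately for each school, then on the pooled table.
within = [pearson(a, b) for a, b in schools]
pooled_a = [v for a, _ in schools for v in a]
pooled_b = [v for _, b in schools for v in b]
pooled = pearson(pooled_a, pooled_b)

print("within schools:", within)
print("pooled:", round(pooled, 3))
```

In these figures the pooled coefficient reflects differences among the judges rather than any relation between the traits, which is why the mean and variability of the separate school correlations would be the more informative report.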
Assuming, moreover, that these two errors (lack of correction
and improper treatment of data) check each other and that, by good
fortune, the correlations, as presented, are true, what are the
probabilities of the correlations harboring a vicious spread of
general estimate?

The only reference Boyce made to the factor of spread or
absence of analysis is on page 42, when he discussed the value of
judging for separate traits. "The topics must not be too few," he
said, "for either they will be so general that little analysis is made,
or, if not general, they will be sure to leave out important points."

In discussing the significance of the correlations between general
teaching ability and specific traits, Boyce does not seem to have
been much impressed with the factor of spread. Before assuming
its absence, however, he should have computed the correlations
between the respective pairs of traits. In other data we find that
the correlations between pairs of traits range the same as do the
correlations between specific traits and general teaching ability.
This computation was not made by Boyce, nor can it be made from
any of his reported data. The best evidence for the presence or
absence of spread of general estimate is, therefore, not available.

The distribution of Boyce's correlations is as follows:

Frequency    Correlation Range

             +.300 to +.350
 1           +.350 to +.400
 2           +.400 to +.450
 3           +.450 to +.500
 4           +.500 to +.550
 5           +.550 to +.600
 7           +.600 to +.650
 8           +.650 to +.700
 8           +.700 to +.750
 4           +.750 to +.800
 5           +.800 to +.850
 6           +.850 to +.900

The average correlation between general teaching ability and
specific traits is +.70. This is, of course, far too high.
Unfortunately, we do not know what variations from +.70 pure
chance would explain. Within a variation of ±.10, 60 per cent
of the correlations fall. Further, within a variation of ±.15, 85
per cent of the correlations fall. If teaching is a complex process
and if the traits recorded are distinct and specific, even to a small
degree, the fact that 85 per cent of the correlations are found
within a range of ±.15 is exceedingly suggestive of the presence
of this spread of general estimate.
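
The average and the clustering about it can be recomputed from the forty-five coefficients as listed. The sketch below is a simple tally under stated assumptions: the coefficients are entered in hundredths in the order of traits 1 to 45, values lying exactly on a limit are counted as inside it, and the helper name `share_within` is hypothetical.

```python
# Boyce's coefficients with general teaching ability, in hundredths,
# in the order of traits 1 to 45 as listed in the table.
R_HUNDREDTHS = [47, 56, 53, 62, 77, 80, 74, 69, 71, 63, 66, 66, 69, 61, 41,
                38, 72, 76, 65, 62, 61, 69, 66, 72, 68, 55, 61, 54, 64, 79,
                81, 86, 84, 84, 72, 85, 87, 82, 74, 76, 86, 87, 88, 70, 71]

def share_within(values, center, half_width):
    """Count and percentage of values no farther than half_width from center."""
    inside = sum(1 for v in values if abs(v - center) <= half_width)
    return inside, 100.0 * inside / len(values)

mean_r = sum(R_HUNDREDTHS) / len(R_HUNDREDTHS) / 100.0
print("average r:", round(mean_r, 2))

for half_width in (10, 15):
    count, pct = share_within(R_HUNDREDTHS, 70, half_width)
    print(f"within +/-.{half_width}: {count} of {len(R_HUNDREDTHS)} "
          f"({pct:.0f} per cent)")
```

How coefficients lying exactly on a limit are treated, and the rounding of the listed values to two places, both affect the percentages, so the output should be compared with the printed figures with that in mind.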

A cursory examination of the individual correlations suggests
the same thing. There is only .03 difference between the
correlation of academic preparation and the correlation of professional
preparation with general teaching ability! Within a range of .04
come the correlations of industry, enthusiasm, tact, grasp of
subject-matter, interest in lives of the pupils, professional growth,
daily preparation, skill in questioning, moral influence, and
stimulation of community with general teaching ability. The least
important trait of all is professional preparation! Within ±.04
the following are of the same significance: care of routine,
intellectual capacity, integrity and sincerity, self-control, promptness,
sense of justice, interest in the life of the school, interest in the life
of community, ability to meet and interest patrons,
cooperation and loyalty, daily preparation, and care of light, heat,
and ventilation. This decided monotony of the size of the
correlations, which are obviously too high, is patent witness of the
presence of spread of general estimate.

In our consideration of the correlations between general teaching
ability, intellectual strength, and skill in discipline for Towns
A, B, C, the fact that analysis of general worth into specific traits
was not as complete as one would have ordinarily supposed, is
statistically demonstrated. When the ratings of 120 teachers in
a New York school system for 45 separate traits were examined,
the evidence again showed that analyzed judgments are far from
being beyond question.

In Boyce's study, as reported, while complete statistical
treatment is out of the question, the correlations, as given, do not show
the range that common sense would lead us to expect. Their
monotonous similarity also suggests that, when analyzed judgments
are attempted, the influence of general estimate is so strong
that the resulting analyses are perhaps even more justifications
of the general estimates than they are judgments of the specific
trait.

The purpose of this chapter has been to present data which
show that general estimate permeates judgments of specific
traits to a degree which has not hitherto been sufficiently
emphasized.


